ABSTRACT

The CORNISH project is the highest resolution radio continuum survey of the Galactic plane to date. 
It is the 5 GHz radio continuum part of a series of multi-wavelength surveys that focus on the northern 
GLIMPSE region (10° < l < 65°), observed by the Spitzer satellite in the mid-infrared. Observations with
the Karl G. Jansky Very Large Array (VLA) in B and BnA configurations have yielded a 1.5" resolution
Stokes I map with a root-mean-squared noise level better than 0.4 mJy beam⁻¹. Here we describe the
data-processing methods and data characteristics, and present a new, uniform catalogue of compact
radio-emission. This includes an implementation of automatic deconvolution that provides much more reliable
imaging than standard CLEANing. A rigorous investigation of the noise characteristics and reliability of
source detection has been carried out. We show that the survey is optimised to detect emission on size scales
up to 14" and for unresolved sources the catalogue is more than 90 percent complete at a flux density of
3.9 mJy. We have detected 3,062 sources above a 7σ detection limit and present their ensemble properties.
The catalogue is highly reliable away from regions containing poorly-sampled extended emission, which
comprise less than two percent of the survey area. Imaging problems have been mitigated by
down-weighting the shortest spacings and potential artefacts flagged via a rigorous manual inspection with
reference to the Spitzer infrared data. We present images of the most common source types found: H II
regions, planetary nebulae and radio-galaxies. The CORNISH data and catalogue are available online at
http://cornish.leeds.ac.uk

1. Introduction

The observed progression of massive star formation, 
from cold collapsing core to young OB clusters, is largely 
understood via observations of discrete examples that 
have been ordered into an evolutionary sequence. Key 
to separating objects of different age and type are mea- 
surements of their spectral energy distributions (SEDs) 
at sub-millimetre, infrared and radio wavelengths. 

The Spitzer GLIMPSE (Galactic Legacy Infrared Mid-Plane Survey Extraordinaire) programme is the first of
a number of sensitive infrared surveys covering the inner Galactic plane at high resolution and in an
unbiased manner (Churchwell et al. 2009). The northern half of GLIMPSE covers the region 10° < l < 65°,
|b| < 1° at wavelengths spanning 3.6 μm - 8.0 μm, which preferentially selects warm and dusty embedded
sources. The companion Spitzer MIPSGAL survey (Carey et al. 2009) has imaged the same region at 24 μm and
70 μm (where the bulk of the energy from massive young stellar objects is emitted) and is hence sensitive
to cooler and more deeply embedded young stellar objects. Most recently, the Herschel Infrared Galactic
Plane survey (Hi-GAL, Molinari et al. 2010) is delivering the most comprehensive survey of embedded objects
to date. With observations in six far-infrared bands between 70 μm and 500 μm, Hi-GAL samples the peak of
the star-forming SED and covers the northern GLIMPSE region out to l = 60°. Completing the infrared picture
of Galactic star formation is the UKIDSS project (UK IR Deep Sky Survey, Lawrence et al. 2007). A subset of
UKIDSS (the Galactic Plane Survey, Lucas et al. 2008) has observed the northern GLIMPSE region in the
near-infrared J, H and K bands and is sensitive to objects down to 18th magnitude. The combined data from
these surveys are driving the detailed characterisation of the Galactic population via their infrared
colours (e.g., Robitaille et al. 2007; Arvidsson et al. 2010; Smith et al. 2010; Wright et al. 2010;
Mottram et al. 2011). A complementary picture of the molecular and atomic interstellar medium is being
provided by the BU-FCRAO Galactic Ring Survey for CO (Jackson et al. 2006) and the VLA Galactic Plane
Survey (VGPS) for H I (Stil et al. 2006). Similarly, the ongoing Isaac Newton Telescope Photometric Survey
of the Northern Galactic Plane (IPHAS) (Drew et al. 2005) probes Hα in emission towards nebulae, and in both
absorption and emission towards stars. The UKIRT Wide Field Infrared Survey for H2 (UWISH2, Froebrich et al.
2011) also covers the same GLIMPSE region in molecular hydrogen (2.122 μm line), highlighting regions of
shocked or fluorescently excited molecular gas (T ≈ 2000 K, n_H2 > 10³ cm⁻³).

Conspicuous by its absence is a comparable radio continuum survey for compact ionised gas. Previous
surveys are either targeted at individual sources selected via infrared colours (e.g., Wood & Churchwell
1989; Kurtz et al. 1994; Urquhart et al. 2009) or are limited in their resolution and sky-coverage (e.g.,
Becker et al. 1994; White et al. 2005). From a star formation perspective, the presence or absence of
free-free emission is vital to distinguish the more evolved ultra-compact H II (UCH II) regions from their
younger counterparts with similar thermal SEDs (Urquhart et al. 2009, 2011).
The sheer number density of sources in the near and mid-infrared surveys necessitates complementary data
at similarly high resolution to enable the full science potential to be fulfilled. This is particularly
true in highly clustered star forming regions. It is important that any radio-continuum survey for UCH II
regions be carried out at relatively high frequencies (>5 GHz) where thermal free-free emission is
optically thin with a spectral index of S_ν ∝ ν^-0.1. At lower frequencies the spectrum becomes optically
thick with S_ν ∝ ν^2. High-frequency observations hence confer a signal-to-noise advantage and probe the
structure of the ionised gas at all depths in UCH II regions. We note that even at ν = 5 GHz we will be
insensitive to a population of young and compact H II regions: the so-called Hyper-compact H II (HCH II)
regions (see Sewilo et al. 2011 and references therein). These objects have greater emission measures than
UCH II regions and the turnover frequency from optically thick to thin occurs at higher frequencies.
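
The dependence of the turnover frequency on emission measure can be illustrated with a short sketch. It uses a standard textbook approximation for the free-free optical depth, τ ≈ 8.235×10⁻² T_e^-1.35 ν^-2.1 EM (T_e in K, ν in GHz, EM in pc cm⁻⁶), which is not taken from this paper. The electron temperature of 10⁴ K and the two emission measures are illustrative assumptions, not survey results.

```python
# Sketch only: standard approximation for the free-free optical depth.
# The temperature and emission measures below are illustrative assumptions.

def free_free_optical_depth(freq_ghz, emission_measure, electron_temp_k=1.0e4):
    """Optical depth of thermal free-free emission.

    freq_ghz         -- observing frequency in GHz
    emission_measure -- emission measure in pc cm^-6
    electron_temp_k  -- electron temperature in K
    """
    return 8.235e-2 * electron_temp_k ** -1.35 * freq_ghz ** -2.1 * emission_measure


def turnover_frequency_ghz(emission_measure, electron_temp_k=1.0e4):
    """Frequency (GHz) at which the optical depth equals one."""
    return (8.235e-2 * electron_temp_k ** -1.35 * emission_measure) ** (1.0 / 2.1)


# Assumed, order-of-magnitude emission measures for the two classes.
UC_HII_EM = 1.0e7    # pc cm^-6, ultra-compact H II region (assumption)
HC_HII_EM = 1.0e10   # pc cm^-6, hyper-compact H II region (assumption)

for label, em in (("UCH II", UC_HII_EM), ("HCH II", HC_HII_EM)):
    print(f"{label}: turnover at {turnover_frequency_ghz(em):.1f} GHz, "
          f"tau(5 GHz) = {free_free_optical_depth(5.0, em):.2g}")
```

Under these assumed values the lower emission measure turns over below 5 GHz, so the source is optically thin in the survey band, while the higher one turns over well above 5 GHz and remains optically thick.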

No previous radio survey of the Galactic plane has similar resolution and coverage to the Spitzer GLIMPSE
survey. A number of single-dish surveys have been conducted at 5 GHz (e.g., Altenhoff et al. 1979);
however, their arcminute resolution is low compared to the arcsecond resolution of Spitzer. Most
interferometric surveys have been carried out at a frequency of 1.4 GHz (e.g., the NRAO VLA Sky Survey,
Condon et al. 1998) except for the catalogues of Becker et al. (1994), Giveon et al. (2005) and White et
al. (2005), who surveyed the inner Galactic plane (-10° < l < 42°, |b| < 0.4°) at 5 GHz. These three
surveys are published as the Multi-Array Galactic Plane Imaging Survey (MAGPIS). They used the Very Large
Array (VLA) in C and D-configurations, which deliver a relatively large beam (4"x9"), and the total survey
area only covers 26 percent of the northern GLIMPSE region.

The CORNISH (Co-Ordinated Radio 'N' Infrared Sur- 
vey for High-mass star formation) project delivers a uni- 
form, sensitive and high-resolution radio survey of the 
northern GLIMPSE region to address key questions in 
high-mass star formation, as well as many other areas of astrophysics. In addition to UCH II regions, the
CORNISH survey detects many other radio-bright objects, including planetary nebulae, ionised winds from
evolved massive stars, non-thermal emission from active stars, active galactic nuclei and radio galaxies.
The full rationale behind the survey design and the scientific motivation is presented in an accompanying
paper (Hoare et al. 2012).



95123 Catania. Italy 
http://www.ukidss.org

Fig. 1. — A graphical illustration of the CORNISH calibration and imaging pipeline. See the text of Section 3 for more detail.



Table 1: Details of the CORNISH observational epochs.

Epoch  Dates                          Dec. Range        l Range         Config.  Notes
I      2006 Jul 12th → 2006 Sep 16th  -10.5° → +14.2°   21.1° → 48.9°   B        VLA antennas only, storms.
II     2007 Sep 28th → 2007 Oct 6th   -20.8° → -14.9°   10.0° → 16.1°   BnA      VLA + EVLA antennas, low Dec.
IIIa   2007 Oct 27th → 2008 Feb 4th   -14.9° → -10.5°   16.1° → 21.1°   B        VLA + EVLA antennas.
IIIb   2007 Oct 27th → 2008 Feb 4th   +14.2° → +29.1°   48.9° → 65.5°   B        VLA + EVLA antennas.



Note. — The properties of the data differ in the combination of antenna types included in the array, the configuration of the array, the 
weather experienced and the declination range observed. Unless otherwise noted the weather during the observations was reasonable. 



2. Observations 

CORNISH covers the 110 square degrees of the northern GLIMPSE region (10° < l < 65°, |b| < 1°) using the
VLA in B and BnA configurations at 5 GHz. The combination of array configuration and observing frequency
results in a ~1.5" synthesised beam within an 8.9' field of view, corresponding to the full-width
half-maximum (FWHM) primary beam. With a total integration time of 80 seconds per pointing, the
root-mean-squared (RMS) noise in the images is better than 0.4 mJy beam⁻¹, sufficient to detect an
unresolved UCH II region around a B0 star on the far edge of the Galaxy (16 kpc, Kurtz et al. 1994).



CORNISH observations of the northern GLIMPSE re- 
gion were conducted using the VLA during the 2006 and 
2007/2008 observing seasons. The observations fall naturally
into the epochs presented in Table 1, which are
distinguished by the combinations of array configuration 
used, inclusion or exclusion of upgraded EVLA anten- 
nas, declination ranges observed and weather conditions 
experienced. We show later that data from each epoch 
have unique properties. 

To facilitate scheduling, the target area was divided into 42 blocks, each corresponding to eight hours of
observations per day. Blocks contain between 180 and 220 fields arranged in rows of equal right-ascension
on a hexagonal pointing grid. Individual fields were observed as two 45 second 'snapshots' separated by
~4 hours in time, maximising the uv-coverage and minimising the elongation of the synthesised beam. The
telescope was advanced along each row (~20 fields) integrating for 45 seconds on each pointing position,
before observing a secondary calibrator (one of 1832-105, 1856+061 or 1925+211) for two minutes and then
continuing to the next row. Including overheads, the secondary calibrators were observed with a cadence of
twenty minutes. Fields at declinations greater than -15° were observed using the VLA's B configuration
while fields at lower declinations were observed using the BnA configuration, which is designed to
compensate for beam distortion at low elevations.
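
The timing figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text (about 20 fields per row, 45 second snapshots, a two minute calibrator scan, a twenty minute cadence, and the five seconds of settling time removed from each snapshot during calibration). The constant and function names are illustrative, and the row length is approximate, so the implied overhead is only indicative.

```python
# Sketch only: bookkeeping of the observing cadence described in the text.
FIELDS_PER_ROW = 20        # approximate number of fields in one row
SNAPSHOT_S = 45            # seconds spent on each pointing position
CALIBRATOR_S = 120         # seconds on the secondary calibrator per row
CADENCE_S = 20 * 60        # quoted calibrator cadence, including overheads
SETTLING_S = 5             # seconds edited out of each snapshot
SNAPSHOTS_PER_FIELD = 2    # each field is observed twice


def row_duration_s():
    """Scheduled time for one row of snapshots plus its calibrator scan."""
    return FIELDS_PER_ROW * SNAPSHOT_S + CALIBRATOR_S


def implied_overhead_s():
    """Slewing and other overheads implied by the quoted cadence."""
    return CADENCE_S - row_duration_s()


def effective_integration_s():
    """On-source time per field after trimming the settling time."""
    return SNAPSHOTS_PER_FIELD * (SNAPSHOT_S - SETTLING_S)


print(row_duration_s() / 60.0, "min per row")
print(implied_overhead_s() / 60.0, "min implied overhead")
print(effective_integration_s(), "s effective integration per field")
```

The last value reproduces the 80 seconds of integration per pointing quoted for the survey.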

To allow imaging of the widest possible field of view 
without bandwidth-smearing the observations were car- 
ried out in pseudo-spectral line mode. The two 25 MHz 
wide spectral windows (also known as intermediate fre- 
quencies, or IFs) of the VLA correlator were tuned to 
adjoining frequency bands centred on 5 GHz. Each win- 
dow was sampled by eight 3.1 MHz channels, degrad- 
ing the peak response by only a few percent at the edge 
of the 8.9' primary beam. Due to hardware limitations 
only the RR and LL polarisations were recorded, mean- 
ing that linear polarisation information is not available 
in the CORNISH data. 
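
The benefit of the narrow channels can be estimated with a commonly used textbook approximation for bandwidth smearing, which assumes a square bandpass and a Gaussian synthesised beam. The formula is not taken from this paper. The sketch also assumes that the 'edge' of the primary beam is the half-power radius (4.45') and that each channel is 25/8 = 3.125 MHz wide.

```python
import math

# Sketch only: textbook bandwidth-smearing estimate (square bandpass,
# Gaussian beam). The offset and channel width below are assumptions.


def smearing_peak_response(bandwidth_hz, freq_hz, offset_arcsec, beam_arcsec):
    """Fractional peak response of a point source offset from the phase centre."""
    beta = (bandwidth_hz / freq_hz) * (offset_arcsec / beam_arcsec)
    if beta == 0.0:
        return 1.0
    k = math.sqrt(math.log(2.0))
    return math.sqrt(math.pi) / (2.0 * k * beta) * math.erf(k * beta)


FREQ_HZ = 5.0e9
BEAM_ARCSEC = 1.5
EDGE_ARCSEC = 0.5 * 8.9 * 60.0   # half-power radius of the primary beam

per_channel = smearing_peak_response(3.125e6, FREQ_HZ, EDGE_ARCSEC, BEAM_ARCSEC)
full_window = smearing_peak_response(25.0e6, FREQ_HZ, EDGE_ARCSEC, BEAM_ARCSEC)

print(f"3.125 MHz channel: {per_channel:.3f}")
print(f"25 MHz window:     {full_window:.3f}")
```

Under these assumptions a single channel loses well under one percent of the peak response at the half-power radius, whereas an unchannelised 25 MHz window would lose more than ten percent.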

During both CORNISH observing seasons significant 
engineering works were underway to upgrade the VLA 
to the next generation instrument: the Expanded VLA 
(EVLA). In 2006 between two and six antennas were 
missing from the array as they were being refurbished 
with new receivers and electronics to convert them to 



the EVLA design. By the start of the second season 
of CORNISH observations (September 2007), almost half 
the array was comprised of EVLA antennas and the in- 
strument was operating in a transition mode. Over the 
season VLA antennas were progressively removed from 
the active array and substituted by EVLA antennas. The 
EVLA antennas conferred the advantage of enhanced 
sensitivity, but at the same time were untested and prone 
to software and hardware problems. Special care was 
needed to properly calibrate VLA-EVLA baselines and 
to ensure that the EVLA data were properly flagged. As 
part of the upgrade, the venerable Modcomp-based VLA 
control system was also replaced in mid-2007 with new
software running under Linux. Taken as a whole, the 
CORNISH data required close inspection and vigilance 
during post-processing. 

3. The data reduction pipeline 

The raw CORNISH dataset consists of 9,349 pointing 
positions, each of which was observed twice. A manu- 
ally guided data-reduction procedure was considered too 
labour-intensive to use on such a large volume of data, 
hence a semi-automatic pipeline was developed with the 
control parameters tuned to the average observation. 
This approach has the advantage of applying uniform 
processing over the majority of the survey area, while 
still allowing manual intervention in a minority of special 
cases (e.g., fields with complicated emission structures, 
or very bright sources). 

The pipeline was implemented in the Python language and made use of the ObitTalk module to interface
directly with the NRAO aips and obit data-reduction packages. The CORNISH pipeline utilised a MySQL
database to record meta-data and perform bookkeeping operations during the reduction procedure.

Figure 1 illustrates the pipeline logic, which is broken up into calibration and imaging stages. In the
following sections we describe each of the stages in detail.

3.1. Calibration and flagging 

Raw data from the telescope were corrected for atmospheric opacity using phase monitor data and written
to an aips-format uv-file in spectral-line mode by the aips task fillm. Each eight-hour block of
observations was first inspected by eye and uv-visibilities with large phase scatter, errant amplitudes or
system-temperature spikes were flagged out of the dataset. Gross errors in the data, such as bad antennas,
IFs or polarisations, were also flagged out at this stage. It was necessary to edit out the first five
seconds of data from each pointing to allow for antenna settling time, reducing the on-source integration
time from 45 to 40 seconds. All manual flagging parameters were written to a master flag list, which was
automatically applied upon restarting the pipeline. Care was taken here that the primary flux calibrators
contained only good data.

3 The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under
cooperative agreement by Associated Universities, Inc.

4 http://www.aips.nrao.edu/

5 http://www.cv.nrao.edu/~bcotton/Obit.html

The shapes of the VLA and EVLA pass-bands are different enough that a six percent closure error has been
measured on EVLA-VLA baselines in continuum modes using 50-MHz bandwidth. This error is expected to be
larger at narrower bandwidths. Because we are operating 
in pseudo-spectral line mode the issue was mitigated by 
performing bandpass calibration (phase and amplitude) 
immediately after the initial flagging, and before any fur- 
ther calibration. Solutions for the atmospherically and 
electronically induced changes in phase and amplitude 
were then calculated using the standard aips calib task, 
operating on one of the three secondary calibrators. The 
data were bootstrapped on to an absolute flux scale by 
comparing observations of the quasars 1331+305 (3C286) 
or 0137+331 (3C48) to their C-band model in aips. A 
global calibration table was produced, which could then 
be applied to the whole block. 

After a first pass at calibration the obit flagging tasks AutoFlag and MednFlag were applied to each IF,
polarisation and channel of the secondary calibrator observations. AutoFlag edits out bad visibilities
based on an absolute maximum allowed value in Stokes I or V. Radio-frequency interference (RFI), e.g., from
commercial broadcasting, is often highly polarised and all visibilities with Stokes V amplitude greater
than 2 Jy were flagged as bad. MednFlag applies a rolling median filter to each IF and spectral channel.
Visibilities were edited out if they had Stokes I values greater than five standard deviations from the
median, calculated in a ~50 second time-window. A second pass at calibration was then performed before
applying a similar flagging procedure to the calibrated science observations on a field-by-field basis.
Finally, all calibration and flagging tables were applied to the data and the individual forty-second
pointings were split into uv-FITS format files.
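
The rolling-median editing described above can be sketched in a few lines. This is a minimal illustration of the idea, not the obit MednFlag implementation: the function name, the window length in samples and the use of the median absolute deviation (scaled by 1.4826) as a robust stand-in for the standard deviation are all assumptions made for this example.

```python
from statistics import median

# Sketch only: rolling-median outlier flagging on a time series of Stokes I
# amplitudes. Not the Obit implementation; the scatter estimator is an
# assumption chosen so that the outlier itself does not inflate the threshold.


def median_flag(amplitudes, half_width=5, n_sigma=5.0):
    """Return a list of booleans, True where a sample should be flagged."""
    flags = []
    for i, amp in enumerate(amplitudes):
        lo = max(0, i - half_width)
        hi = min(len(amplitudes), i + half_width + 1)
        window = amplitudes[lo:hi]
        med = median(window)
        mad = median([abs(x - med) for x in window])
        sigma = 1.4826 * mad   # robust estimate of the standard deviation
        flags.append(sigma > 0.0 and abs(amp - med) > n_sigma * sigma)
    return flags


# Hypothetical amplitudes (Jy) with one interference spike at index 7.
amps = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 9.00,
        1.00, 1.02, 0.98, 1.01, 0.99]
flagged = [i for i, bad in enumerate(median_flag(amps)) if bad]
print(flagged)
```

Only the spike is rejected in this example, while the ordinary scatter of the remaining samples stays well inside the five-sigma threshold.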

Meta-data associated with each observation (e.g., the 
pointing centre co-ordinates and number of flagged 
visibilities) were automatically saved in the MySQL 
database. The imaging procedure subsequently queried 
this database when building a final mosaiced image. 

3.2. Imaging 

Fields were imaged using the obit imager task, which performs imaging and deconvolution in a similar manner
to the aips imagr routine. The imager task automatically switches between the standard Cotton-Schwab and
SDI deconvolution algorithms (Schwab 1984; Steer et al. 1984), referred to simply as 'clean' in the
following discussion. In addition the task can be instructed to perform both phase and amplitude
self-calibration. It is a complex task with many important input parameters which need careful tuning to
result in a scientifically useful image. Key amongst these are the maximum residual flux, the threshold at
which to begin self-calibration, whether to perform both phase and amplitude self-calibration, and the
weighting function used. The 'average' CORNISH field was imaged using a Briggs-robustness parameter of
zero, which is a compromise between natural (high-sensitivity) and uniform (high-resolution) weighting. A
minority of complex fields were treated as a special case and imaged with a custom weighting scheme (see
Section 3.2.2 below). Self-calibration was performed on sources with peak fluxes greater than
30 mJy beam⁻¹. During the deconvolution process the restoring Gaussian beam was forced to be circular and
have a full-width half-maximum of 1.5". The cell-size was set at 0.3", over-sampling the synthesised beam.
We justify these choices in the following sections.

6 See http://www.vla.nrao.edu/astro/guides/evlareturn/ for details.

The imager task implements two algorithms which improve the dynamic range and quality of the final image.
Firstly, the AutoWindow function dynamically places small (< 20 pixels in radius) clean windows over
regions of emission in the intermediate dirty map. The validity of each window is assessed periodically
during the clean cycle and additional windows are created as necessary. The net effect is to clean only
real emission and avoid CLEANing noise, resulting in a smaller clean bias. Secondly, the AutoCen algorithm
regrids uv-data containing bright point sources, so their peaks fall on a pixel centre. A point source at
the centre of a pixel can be represented as a single delta function, i.e., a single clean-component,
leading to a significant improvement in dynamic range. In contrast, a bright point source offset from a
pixel centre requires multiple clean components, both positive and negative, to model its emission. This
more complex model will inevitably suffer from rounding errors on finite-precision computers, leading to
flux being scattered into the surrounding sky. EVLA memos 116 and 114 by Bill Cotton contain detailed
descriptions of the AutoWindow and AutoCen algorithms.

7 http://www.aoc.nrao.edu/evla/geninfo/memoseries/evlamemo116.pdf
8 http://www.aoc.nrao.edu/evla/geninfo/memoseries/evlamemo114.pdf

Fig. 2. — Fraction of recovered flux versus multiplicative factor f_MRF for an artificial source injected
into the empty field 18173-18247. The target 'maximum residual flux' (MRF) driving the deconvolution
procedure was set to RMS_V × f_MRF, where RMS_V is the root-mean-squared noise measured in the Stokes V
dirty map. The amplitude of the RMS noise in the image expressed as a fraction of the recovered flux is
illustrated by the grey-shaded area. For factors below ~0.8 the field is over-cleaned, leading to
significant artefacts in the image. Values less than 2.0 recover greater than 96 percent of the flux.

3.2.1. Controlling the deconvolution algorithm 

One of the most critical control parameters for the imaging task is the target maximum residual flux
(MRF). The ideal value varies from field to field depending on the weather conditions, individual antenna
system temperatures and the structure and strength of emission in the field of view. To estimate the
intrinsic sensitivity attainable we imaged each field in Stokes V and measured the RMS noise. Sources with
significant circularly polarised emission at 5 GHz are rare (Roberts et al. 1975; Homan & Lister 2006), so
the RMS noise measured from an uncleaned Stokes V image is expected to be comparable to the final CLEANed
noise level. Because the two polarisation beams of the VLA are not co-aligned on the sky they give rise to
a strong instrumental polarisation away from the pointing centre. To compensate for this 'beam squint'
only the central 2' diameter portion of each field was imaged and measured. Despite this we expect noise in
the Stokes V images to be higher than the ideal in their Stokes I counterparts. In addition, the 1.5"
restoring beam applied to the I images is larger, on average, than the unconstrained synthesised beam of
the V images. This effectively smooths the noise in the Stokes I images compared to the V.

9 http://www.aoc.nrao.edu/evla/geninfo/memoseries/evlamemo113.pdf

The target MRF was assumed to be equal to RMS_V × f_MRF, where f_MRF is a constant multiplicative factor.
The canonical value for f_MRF was determined in two ways. Firstly, artificial point sources were injected
into the uv-data for an emission-free field and the obit imager task was applied using a range of values
for f_MRF. After each iteration the recovered flux and RMS noise of the final image were measured. Figure 2
shows the results of this experiment. We found in practice that values of f_MRF = 0.8 - 2.0 recovered
greater than 96 percent of the flux, within the errors. Secondly, we chose a representative sample of
compact sources with simple morphologies and inspected the deconvolved images for residual sidelobe
structure. Values of f_MRF = 0.8 - 1.0 were required to fully remove sidelobe structure from the images. We
re-imaged fields using a higher threshold where obvious clean artefacts were present. Figure 3 shows the
values of f_MRF used across the survey. Most of the CORNISH area was imaged using f_MRF = 0.8. Fields at
lower declinations (including the BnA observations) required higher f_MRF values to avoid producing
increased numbers of low-level artefacts in otherwise empty regions. Unresolved weak detections are
dominated by the extragalactic population and hence have a flat distribution on the sky (Anglada et al.
1998; see also Section 6.2.2). Based on an initial pass at the data reduction we used two further levels of
f_MRF = 1.0 and 1.1, chosen to keep the number of 5-6σ point-sources roughly constant away from the
Galactic mid-plane. However, multiplier values between 0.8 and 1.1 lead to imaging artefacts in a minority
of fields affected by poor calibration, containing very bright point sources (>1 Jy) or extended emission.
Such fields were inspected and cleaned manually. A small proportion of fields (of order 1 percent) were
found to contain significant circularly polarised flux and were also cleaned by hand. It is likely that
this emission is due to the instrumental beam squint, rather than real emission on the sky. The obit imager
task does not correct for beam squint; however, tests on selected CORNISH data did not find any believable
Stokes V after a correction had been applied (B. Cotton, private communication). The manually cleaned
fields are mostly confined to high-mass star-formation regions. They appear on Figure 3 as patches of red
hexagons clustered around the mid-plane of the Galaxy. Values used in these cases ranged over
1.2 < f_MRF < 10, with a mean of 3.2.

Applying the deconvolution algorithm close to the noise can result in an increase in the so-called 'clean
bias', an effect which results in a systematic reduction in object fluxes. It is believed to be caused by
inadvertently CLEANing bright sidelobes, leading to a subtraction of real flux from astronomical objects
(White et al. 1997; Condon et al. 1998). We measure the clean bias for CORNISH images in Section 5.

Fig. 4. — Top panel: Fractional recovered flux as a function of full-width half-maximum (FWHM) for an
artificial 1 Jy beam⁻¹ peak Gaussian source injected into an empty field. Data imaged with a Gaussian
smoothed weighting are missing a greater fraction of the flux compared to data imaged with robust = 0
weighting. Bottom panel: Root-mean-square noise as a function of FWHM for 1 Jy flux density Gaussian
sources. The images made using the Gaussian smoothed weights tend to have lower and more stable noise
properties.

3.2.2. Imaging extended emission 

Baseline lengths on the VLA B and BnA arrays range from approximately 300 kλ to 2 kλ, equivalent to
spatial scales of 1.5" to 2', respectively. These array configurations sample the uv-plane less well at
shorter spacings and the deconvolution algorithm has difficulty reconstructing image structure on scales
greater than ~14". The imaging procedure tends to produce 'waves' or 'ruffles' in the background of fields
containing significant extended emission. Flux is also scattered over the image as the standard clean
algorithm attempts to model the emission as a series of delta functions. This can lead to high RMS noise
levels and multiple imaging artefacts, especially if self-calibration is allowed to run unchecked. A total
of 193 fields (~2 %) were found to have poorly-imaged extended emission. To combat this problem we imaged
these fields using a custom weighting scheme. By default the CORNISH pipeline is configured to use robust
weighting (Briggs 1995), which is a compromise between the low thermal noise of natural weighting and the
high resolution of uniform weighting.

Fig. 5. — Top panel: Fractional absolute residual flux remaining after subtracting a model Gaussian source
from the test image. Low values mean that the image is similar to the model while high values mean that
there are significant differences in morphology. Bottom panel: Fitted versus injected FWHM. Sources with
FWHM >14" are poorly imaged by the VLA B-array.

In the case of uniform weighting the obit imager weights each visibility by the inverse of the sum of the
weights present in each cell in the uv-plane. For fields with extended emission we have instead weighted by
the inverse of the number of visibilities within a radius of ten cells, attenuated by a Gaussian function.
For the B and BnA arrays this has the effect of weighting down the poorly sampled short spacings, similar
to the effect of applying an inverse taper. We found that the RMS noise and number of artefacts in the
imaged fields are reduced at the expense of additional 'missing' flux. The reduction in flux compared to a
uniform or robust weighted image is highly dependent on the structure of the emission.
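
One possible reading of the Gaussian-smoothed density weighting is sketched below. It is an illustration of the idea only, not the obit code: the function name is hypothetical, the visibilities are given directly in units of grid cells, and the Gaussian width (half of the ten-cell radius) is an assumption, since the text does not state it.

```python
import math

# Sketch only: inverse local-density weights with a Gaussian attenuation.
# Not the Obit implementation; sigma_cells is an assumed value.


def smoothed_density_weights(uv_cells, radius_cells=10.0, sigma_cells=5.0):
    """Weight each visibility by the inverse of the Gaussian-attenuated
    number of visibilities within radius_cells of it.

    uv_cells -- list of (u, v) positions in units of grid cells
    """
    weights = []
    for u0, v0 in uv_cells:
        density = 0.0
        for u, v in uv_cells:
            dist = math.hypot(u - u0, v - v0)
            if dist <= radius_cells:
                density += math.exp(-0.5 * (dist / sigma_cells) ** 2)
        # The visibility always counts itself, so density >= 1.
        weights.append(1.0 / density)
    return weights


# Five closely packed samples and one isolated sample (hypothetical values).
samples = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (100, 0)]
w = smoothed_density_weights(samples)
print([round(x, 3) for x in w])
```

Visibilities in the crowded part of the plane receive small weights, while the isolated one keeps a weight of one. This brute-force version scales as the square of the number of visibilities, so a practical implementation would accumulate the counts on the uv-grid instead.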

In order to quantify the effect of the two weighting schemes (robust = 0 and Gaussian) we imaged
artificial Gaussian sources of increasing size inserted into the uv-data for a blank field. The peak flux
was fixed at 1.0 Jy beam⁻¹ while the full-width half-maximum was increased from 2.0" to 26" in steps of
2". Figure 4 (top panel) plots the recovered flux as a function of the source FWHM for both weighting
schemes. When imaging using robust weighting, the fraction of recovered flux drops off at FWHMs greater
than ~8". When using the Gaussian-smoothed weighting scheme this drop-off also occurs at FWHM > 8";
however, the fraction of recovered flux falls more rapidly. The bottom panel of Figure 4 shows the RMS
noise as a function of injected source size. In this case the flux density was fixed at 1.0 Jy to avoid
being dynamic-range limited at higher fluxes. It is clear that the RMS noise in the Gaussian weighted
images is significantly lower and more stable as a function of FWHM. In practice dynamic ranges of several
thousand are achieved on isolated point sources, falling to several hundred for slightly resolved sources
(> 1.8").

Figure 5 shows the effect of the two weighting schemes on the fidelity of the images. The top panel plots
the absolute fractional value of the residual flux remaining after the model image is subtracted from the
pipeline imaged data (|S_resid|/S_model). Lower numbers mean that the image is similar to the model, while
higher numbers mean that there are significant structural differences. It can be seen from the plot that
the Gaussian weighting scheme is the most consistent at representing the source morphology, while the
robust weighted images break down between FWHM = 8" and 10". To further quantify the effect we fit the
pipeline imaged data with a 2D-Gaussian using the miriad task imfit. The bottom panel of Figure 5 plots
the fitted versus injected FWHM. It is again clear that the robust weighted images begin to differ from
the model at FWHM ≳ 10", while the Gaussian weighting scheme preserves structures out to ~14". Note that
real-world emission with complex morphology will react differently to the Gaussian weighting scheme,
depending on its visibility function.

3.3. Mosaicing 

All fields were imaged out to a radius of eight arcmin- 
utes (~ 10 percent power pattern) before being linearly 
mosaiced in the image plane onto 20' x 20' tiles ori- 



ented in equatorial (J2000) coordinates. In total 1408 
tiles cover the survey area, each of which overlap by 
1'. Pointing centres sit on a close-packed hexagonal grid 
adapte d from the 1.4 GHz NVSS survey and scaled to 
5 GHz. Condon et aL J 1998h justifies this layout in detail 



and iHoare et al.l ([20121 ) describes the implementation in 
CORNISH. Here we provide a summary for convenience. 
Adjacent CORNISH pointing centres are separated by 
7.4', compared to the 8.9' full-width half-maximum of 
the primary beam at 4.86 GHz. The separation is opti- 
mised to maximise the uniformity of the noise pattern 
without appreciably degrading observing efficiency. At 
any point in the mosaic the sky brightness $B$ is given
by a weighted sum of the individual brightness values $b_i$
contributed by the overlapping snapshots.

To maximise sensitivity the weighting factor $W_i$ was set
to be proportional to $P(p)$, the primary beam pattern as
a function of offset $p$ from the pointing centre. This
correction is necessary as the noise is constant across a raw
snapshot image, so each pixel must be weighted by the
square of its signal-to-noise ratio. The weighting method is
implemented in the CORNISH pipeline in two steps. Individual
fields are first multiplied by $P(p)$ and summed onto
a blank tile. This image is then divided by a 'weight image'
created from the sum of $P^2(p)$ functions (modelled
by Gaussians for the VLA). The resultant data product
is a mosaiced image which has been primary beam
corrected (i.e., divided by $P(p)$). Figure 6 illustrates
an example of a weight image. The minimum weight is
$P^2(p) = 0.83$, hence the worst-case relative sensitivity is
$\sqrt{P^2(p)} = 0.91$.
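
The two steps described above amount to forming $B = \sum_i P_i b_i / \sum_i P_i^2$ at each pixel. The following is a minimal NumPy sketch of that operation, not the CORNISH pipeline itself: the function names, the assumption that snapshots are already regridded onto the tile, and the use of a single Gaussian with the 8.9' FWHM quoted above as the beam model are all illustrative choices.

```python
import numpy as np

FWHM_ARCMIN = 8.9  # primary-beam FWHM at 4.86 GHz quoted in the text


def primary_beam(offset_arcmin):
    """Gaussian model of the primary beam response P(p) (an assumption)."""
    sigma = FWHM_ARCMIN / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * (offset_arcmin / sigma) ** 2)


def mosaic(snapshots, centres, grid_x, grid_y, cutoff_arcmin=8.0):
    """Linearly mosaic raw snapshot images onto a common tile.

    snapshots : list of 2D arrays already regridded onto the tile
    centres   : list of (x, y) pointing centres in arcmin
    grid_x/y  : 2D arrays of tile pixel coordinates in arcmin
    Returns the primary-beam-corrected mosaic and the weight image.
    """
    numerator = np.zeros_like(grid_x, dtype=float)
    weight = np.zeros_like(grid_x, dtype=float)
    for image, (cx, cy) in zip(snapshots, centres):
        offset = np.hypot(grid_x - cx, grid_y - cy)
        # Fields are only used out to the imaging radius of eight arcminutes.
        beam = np.where(offset <= cutoff_arcmin, primary_beam(offset), 0.0)
        numerator += beam * image   # step 1: multiply by P(p) and sum
        weight += beam ** 2         # weight image: sum of P^2(p)
    with np.errstate(invalid="ignore", divide="ignore"):
        result = np.where(weight > 0, numerator / weight, np.nan)
    return result, weight
```

With this Gaussian model the response at the eight-arcminute cutoff is about 0.1, in line with the ~10 percent power level quoted above. Dividing by the summed $P^2$ rather than by $P$ is what makes the combination both primary-beam corrected and optimally weighted for constant snapshot noise.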
Fig. 6. — Example of a weight image used in the mosaicing.
The greyscale depicts the squared beam patterns
$P^2(p)$ accumulated onto a 20' x 20' tile. The green circle
shows the extent of a single field and the dot its pointing
centre.
Fig. 7. — Images of the secondary calibrators used to perform phase tracking. All three quasars exhibit jets, while 1925+211 has
an extended jet and there are several sources brighter than 1 mJy within one arcminute. The image scales have been stretched
to show all real emission, but also highlight very low-level imaging artefacts. Only clean-components from real emission were
used when calibrating the data.



4. Data quality 

Data were reduced and imaged for quality control pur- 
poses immediately after the observations were completed. 
Bad data were quickly identified allowing the affected 
fields to be re-scheduled in the observing queue. The
rapid turn-around time meant that we were able to re- 
observe most fields affected by poor weather in 2006 and 
system power glitches in 2007/2008. 

4.1. Calibration 

Three quasars, spaced equally along the plane of the 
Galaxy, were used as secondary (phase) calibrators for 
the whole survey. Although initially assumed to be point- 
like, we found that each exhibited structure at the 0.1 to 
2.0 percent level, in the form of radio-jets and nearby 
confusing sources. Using the full complement of data 
available we imaged and self-calibrated each secondary 
calibrator field out to the full-width half-power radius. 
Fig. 8. — Top panel: Plot of the percentage deviation from
the median flux density versus block number for the three 
secondary calibrators. Each point represents the integrated 
flux density measured from an eight hour observing block. 
The scatter is mostly within five percent. Bottom panel: The 
same plot as above except for the backup primary calibrator. 



The resulting clean-component models were used as inputs
to the calibration procedure. Images of the secondary
calibrators are presented in Figure 7. Quasars
1832-105 and 1856+061 deviate from point sources,
exhibiting jets with flux densities peaking at two percent of
the main peak. The source 1925+211 shows significant
structure within one arcminute of the central source,
including an elongated jet and two point sources of 1 mJy
and 6 mJy (0.1 and 0.4 percent of the main peak,
respectively).

Two primary flux density calibrators were observed, 
providing a redundant means of flux-calibrating the data. 
1331+305 (3C286) was observed at the beginning of an 
observing block and 0137+331 (3C48) at the end. With a 
5 GHz flux density of 7.47 Jy 1331+305 was the preferred 
calibrator. However, 0137+331 ($S_{\rm 5\,GHz}$ = 5.48 Jy) was
used if technical or weather-related problems affected the 
initial data from a block. 

The small number of calibrators observed allowed us 
to check the consistency of our calibration with time. 
Figure [8] shows the percentage deviation from the me- 
dian flux densities of the three secondary calibrators 
and the backup primary calibrator. Each point on the 
plot represents an 8-hour block of observations. Cal- 
ibrator flux densities were measured directly from the 
image data by manually drawing a polygon around each 
quasar and summing the flux within the polygon. For 
the secondary calibrators the standard deviation in flux
is 2.7 percent, and for the backup calibrator 8.9 percent,
consistent with the accuracy of previous VLA surveys
(e.g., Condon et al. 1998, who quote three percent at
1.4 GHz). No variation with time is seen, implying that
the calibration is stable over the two observing seasons. 
The scatter in the backup calibrator is a more appropri- 
ate error to quote for snapshot imaging and is adopted 
as the formal amplitude calibration error for CORNISH 
data. 

All CORNISH observations are phase-referenced to one 
of the three secondary calibrators and hence adopt their 
positional uncertainties. The formal positional uncertainties
may be found in the VLA calibrator manual
and are < 150 milliarcseconds (mas) for 1832-105, <
10 mas for 1856+061 and < 2 mas for 1925+211.

4.2. Synthesized Beam shape 

The dual-snapshot observing scheme was designed to 
deliver the most circular synthesised beam possible, while 
allowing both snapshots to be taken within a single eight- 
hour observing block. To minimise the total range of syn- 
thesised beam shapes in the survey each field should ide- 
ally be observed at an equal ±3 hr hour-angle before and 
after its zenith position. Scheduling constraints meant 
that this was not achieved in practice and a compromise
of four hours between snapshot images was implemented.
Hoare et al. (2012) present the parameters of the synthesised
beams attained in the final images, which we
briefly summarise here.

Within each observing block the beam elongation in- 
creases towards lower declinations, while the position an- 
gle varies by ~ 60 degrees. The distribution of beam 
minor axes in the survey area separates into two distinct
populations, with a small peak at 0.77" and a large peak 
at 1.2". The smaller peak stems from the low-declination 
fields observed using the BnA array configuration, while 
the larger one contains the majority of fields observed 
using the B array. In contrast, the distribution of ma- 
jor axes values is monolithic, with a median at 1.5" and 
a standard deviation of 0.32". Ninety-eight percent of 
fields have elongations less than two and seventy-four 
percent less than 1.5. 

Based on these values, we chose to force a circu- 
lar restoring beam of FWHM 1.5" because this greatly 
simplified the mosaicing operation and meant that the 
restoring beam shape was constant across every mosaiced 
image. The value 1.5" was chosen as the median value of 
the measured major-axes from all CORNISH fields. The 
degree of super-resolution is presented in Figure 9 and is
less than 1.5 in ninety-six percent of fields. The restor- 
ing beam area is larger than the synthesised beam area 
for 8,154 fields (87.2 percent) and is less than 1.5 times 
greater in 9,343 fields (99.9 percent). 

4.3. Sensitivity and uniformity 

Figure [10] presents an image of the RMS noise over the 
full survey area, with each colour-coded hexagon representing
a field. The locations of H II region complexes are
prominent as clumps of high-noise fields located close to 
the mid-plane of the Galaxy. Away from such regions the 
noise level within individual scan-rows (scanning in RA) 
is relatively constant compared to the variation between 
rows, which is largely weather related. The observa- 
tion area can be divided into two regions with noticeably 
different noise properties. At declinations greater than
δ = 14.2°, the median RMS noise is significantly lower
(RMS$_{\rm outer}$ = 0.25 mJy beam$^{-1}$) than in the remainder of the
survey area (RMS$_{\rm inner}$ = 0.35 mJy beam$^{-1}$). This outer
CORNISH region corresponds directly to the epoch-IIIb
observations detailed in Table 1. From the 2007 season
onwards the VLA made extensive use of the upgraded
EVLA antennas, which have more sensitive receivers. In
addition, the weather conditions were better in the second
season than in 2006, when observations were affected
by electrical storms. Observations of approximately the
inner 20 degrees of the CORNISH area (δ < -10.5°) also
took place during the second season, corresponding to
epochs II and IIIa. However, the RMS noise level is similar
to the 2006 season for a number of reasons. In particular,
the inner CORNISH region is seen at relatively low
elevations from the VLA site, requiring the telescope to 
peer through a greater path-length of atmosphere. Emis- 
sion from the atmosphere causes an increase in system 
temperature decreasing the signal-to-noise ratio in the 
data. The epoch-II observations utilised fewer EVLA an- 
tennas and, because the telescope was at the beginning 
of the VLA/EVLA transition, required extensive flagging 
to render the data usable. Figure 11 presents a histogram
of the distribution of noise measurements, sampled on 2' 
scales, across the whole survey. The division between 
the inner and outer CORNISH regions is obvious. Both 
regions exhibit high-noise tails, corresponding to fields 
containing bright and extended emission. 

4.3.1. Spatial scale of noise

Interferometry data often exhibit non-Gaussian noise
statistics, largely due to the non-linear deconvolution
process and poorly sampled uv-coverage at large spatial
scales. In regions with complex structures on scales
greater than ~ 14" the emission is poorly constrained by
the uv-coverage of the VLA B arrays. If only a few short
baselines contain most of the flux a simple fringe pattern
is produced on the sky. The flux is not evenly distributed
but accumulates at specific spatial scales, depending on
the sampling in uv-space. The deconvolution algorithms
used here also struggle to model this emission, resulting
in some of the flux being scattered onto the surrounding
sky (see Section 3.2). It is important to characterise this
'ripple' noise pattern before attempting to search for real
emission in the CORNISH data.

Fig. 9. — Histogram showing the distribution of super-resolution
caused by forcing a 1.5" circular restoring beam
for all pointing positions.

Fig. 10. — Map of the RMS noise in each field of the CORNISH survey. Striping in right ascension is due to changes in
observing conditions between scan-rows. Clusters of high-noise fields (red) occur at the locations of star-forming complexes,
which contain bright and extended emission.

Fig. 11. — Distribution of the total noise over the survey
region made by sampling on 2' scales (solid histogram). The
right-slanted histogram (blue) contains only data from the
epoch IIIb observations (δ > 14.2°) while the left-slanted
histogram (red) contains the remainder (epochs I, II and IIIa).

Fig. 12. — Results of repeated noise measurements performed
using a range of square apertures on CORNISH data
containing a ripple. The y-axis plots the scatter in the ensemble
set of measurements as a function of aperture size. It is
clear the MADFM values are robust for apertures with scales
greater than 40".

We have measured the noise characteristics of representative
CORNISH data affected by a ripple. The region
chosen was centred on $\alpha = 18^{\rm h}09^{\rm m}21.96^{\rm s}$,
$\delta = -20^\circ19'34.9''$ and the RMS noise was measured using
both the standard-deviation (STDEV) and median absolute
deviation from the median (MADFM) statistics.
For a dataset $X = x_1, x_2 \ldots x_i \ldots x_n$ MADFM is given
by

$$\mathrm{MADFM} = K\,\mathrm{median}\left(|x_i - \mathrm{median}(X)|\right), \qquad (2)$$

i.e., the median of the absolute deviations from the median value.
For a normal distribution MADFM is equivalent to the
standard deviation using a scale factor $K = 1.4826$. The
advantage of MADFM is that it is insensitive to the presence
of outliers in the distribution and delivers a robust
estimate of the true noise. Measurements were conducted
using a range of aperture sizes, varying between 12" and
240" in steps of 2.82". In total twenty one positions were
measured, offset in declination by 6" along a line centred
on the noise peak. The scatter in the results (expressed
in standard deviations σ) for each aperture size is plotted
in Figure 12. From the plot we see that the scatter in the
ensemble set of measurements increases as the aperture
size decreases. The MADFM statistic remains stable at
smaller spatial scales than the STDEV. At scales less
than 2' the scatter in the STDEV measurement slowly
rises, compared to MADFM, whose scatter remains less
than 0.1 mJy beam$^{-1}$ until scales of 40". Measurements
of the global noise-properties of the CORNISH data are
therefore best performed using apertures spanning 40"
or larger using the MADFM statistic.
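
The robustness of MADFM relative to STDEV can be illustrated with a short NumPy sketch. The noise level, sample size and contamination fraction below are hypothetical values chosen for the demonstration and are not CORNISH measurements.

```python
import numpy as np

K_NORMAL = 1.4826  # scale factor that makes MADFM match sigma for Gaussian noise


def madfm(values, k=K_NORMAL):
    """Median absolute deviation from the median, scaled to sigma."""
    values = np.asarray(values, dtype=float)
    return k * np.median(np.abs(values - np.median(values)))


rng = np.random.default_rng(seed=1)
# Hypothetical noise-only pixels with sigma = 0.35 mJy/beam.
noise = rng.normal(loc=0.0, scale=0.35, size=20000)
# Same pixels with 1 percent replaced by bright "source" emission.
contaminated = noise.copy()
contaminated[:200] += 50.0

print(f"clean:        STDEV={noise.std():.3f}  MADFM={madfm(noise):.3f}")
print(f"contaminated: STDEV={contaminated.std():.3f}  MADFM={madfm(contaminated):.3f}")
```

In the contaminated case the standard deviation is dominated by the bright pixels, while MADFM stays close to the underlying noise level, which is the behaviour exploited when measuring noise in crowded fields.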

5. The CORNISH source finder 

We have developed an automated source finding pro- 
cedure with the aim of producing a well-characterised 
catalogue of 5 GHz emission in the northern Galactic 
plane. In the following subsections we describe the 
source-finding and measurement procedures and inves- 
tigate the limits of the catalogue. 

5.1. Source detection and photometry 

Tiles were automatically searched for emission using a 
custom procedure based on the obit FndSou task. FndSou 
identifies contiguous islands of emission above a global 
intensity threshold and attempts to fit one or more 2D 
Gaussians to each. This approach works well in the simplest
case of an image with homogeneous noise properties;
however, in the worst-case scenario the RMS noise
can change by a factor of a few over a 20' x 20' tile. This
is especially true of tiles covering the Galactic mid-plane, 
where massive star-forming complexes are common. Us- 
ing a single intensity threshold often results in spurious 
detections or omissions of real sources. To compensate 
for variable noise levels we ran FndSou on a 9x9 grid of 
'patches' within the tile area. Each patch is 800 pixels 
(4') on a side and overlaps adjacent patches by 400 pix- 
els in R.A. and Dec. The local RMS noise in each patch 
was determined using a histogram analysis clipped at 3σ
from the median value. With this patch layout a radio 
source within a 2' band around the tile edge may be de- 
tected in two patches, except at the tile corners. A source 
in the interior may be detected in up to four overlapping 
patches. The maximum fitted Gaussian FWHM was constrained
to be < 30" in keeping with the uv-coverage.
Fits within 14" of the patch edge were deemed invalid, 
except where a patch abutted a tile edge. Running this 
patch-based emission finding procedure results in a de- 
generate list of sources with coincident positions derived 
from overlapping patches. A list of unique Gaussian fits 
to each tile was produced by searching for duplicates at 
similar positions (separation < 1") and with similar peak 
amplitudes ($A_{\min}/A_{\max} > 0.7$). The Gaussian fit closest
to the centre of a patch was retained. 
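
The patch layout and the duplicate-removal rule can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the pipeline code: the `Fit` record, the function names and the 4000-pixel tile width (inferred from 800 pixels spanning 4') are hypothetical, and positions are treated as flat-sky offsets in arcseconds.

```python
import math
from dataclasses import dataclass


@dataclass
class Fit:
    x: float             # tile position in arcsec (flat-sky approximation)
    y: float
    peak: float          # fitted peak amplitude, mJy/beam (assumed positive)
    centre_dist: float   # distance from the centre of the patch it came from


def patch_origins(tile_pix=4000, patch_pix=800, step_pix=400):
    """Lower-left pixel of each patch along one axis of the tile."""
    return list(range(0, tile_pix - patch_pix + 1, step_pix))


def is_duplicate(a, b, max_sep=1.0, min_ratio=0.7):
    """Criteria from the text: separation < 1 arcsec and A_min/A_max > 0.7."""
    sep = math.hypot(a.x - b.x, a.y - b.y)
    ratio = min(a.peak, b.peak) / max(a.peak, b.peak)
    return sep < max_sep and ratio > min_ratio


def unique_fits(fits):
    """Keep, for each group of duplicates, the fit nearest its patch centre."""
    kept = []
    # Visiting fits in order of increasing centre distance means the first
    # member of any duplicate group encountered is the one to retain.
    for fit in sorted(fits, key=lambda f: f.centre_dist):
        if not any(is_duplicate(fit, other) for other in kept):
            kept.append(fit)
    return kept
```

With the assumed tile width, `patch_origins()` yields nine origins per axis, consistent with the 9x9 grid described above.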

Initially, the search was conducted using a 4σ local
noise threshold and aperture photometry was performed
to weed out detections with a signal-to-noise ratio
σ < 5.0 (σ = maximum pixel/RMS noise). An elliptical
aperture was used to measure the source properties,
which extended to the 3σ Gaussian major and minor axes
(2.548 x FWHM). If the emission was indeed Gaussian in
shape this aperture would encompass 99.7 percent of the
emitted flux. The RMS noise and median background
level of the sky were measured from a 20" wide annulus
centred on the source and offset from the measurement
aperture by 5". The annulus width was chosen to sample
the local noise pattern without being influenced by
ripples or negative-bowls (see Section 4.3.1). In crowded
regions the sky annulus is likely to contain bright and 
real sources so the noise was measured using the robust 
MADFM statistic. The parameters of the valid Gaussian 
fits and photometric measurements were both recorded 
to the MySQL database, although the Gaussian fits are 
preferentially used in the default CORNISH catalogue. 

5.2. Resolved emission 

The source finder determines accurate fluxes for iso- 
lated and unresolved sources but decomposes complex 
structures into multiple overlapping Gaussian fits. It is
highly desirable to merge these into a single measurement 
to avoid over-interpreting the number-counts and prop- 
erties of sources in the final catalogue. Clusters of Gaus- 
sians were identified in the catalogue using a friends-of- 
friends search: a Gaussian was associated with a cluster if 
it was within 12" of any other member. In total, 741 clus- 
ters were found and these were all inspected manually. 
To distinguish between adjacent but unrelated sources 
and over-resolved emission the morphology at 5 GHz was 
compared to that in the Spitzer GLIMPSE mid-infrared 
images. The most common extended sources in the images
are UCH II regions and planetary nebulae, each of
which has a distinctive mid-infrared signature. For these
types of object the morphology of the 8 μm emission often
echoes that of the radio continuum (Hoare et al. 2007). If
a cluster of Gaussian fits was found to trace an over-resolved
source then the fitted parameters were replaced
with a single measurement under a polygonal aperture
manually drawn around the emission. Figure 13 shows an
example of a polygon carefully drawn around the border
of a cometary H II region. The flux density is calculated
from the sum of the pixels within the source aperture
minus the median background level in the vicinity of the
source. In addition to the coordinates of the peak emission
(for which the source is named in l and b) we also
record the geometric and intensity-weighted positions.
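
The friends-of-friends association can be sketched with a small union-find routine. This is a minimal illustration of the technique, assuming flat-sky offsets in arcseconds; the function name and the example positions are hypothetical and the actual CORNISH implementation may differ.

```python
import math


def friends_of_friends(positions, linking_length=12.0):
    """Group points so that each member lies within `linking_length`
    of at least one other member of the same group (single linkage).

    positions : list of (x, y) in arcsec on a locally flat sky
    returns   : list of groups, each a sorted list of indices
    """
    parent = list(range(len(positions)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if math.hypot(dx, dy) < linking_length:
                parent[find(i)] = find(j)   # merge the two groups

    groups = {}
    for i in range(len(positions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Three fits chained at 10" spacing plus one isolated fit.
fits = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (100.0, 100.0)]
clusters = [g for g in friends_of_friends(fits) if len(g) > 1]
print(clusters)  # [[0, 1, 2]]
```

Note that the first and third fits are 20" apart yet end up in the same cluster through their common neighbour, which is why chains of Gaussians along extended emission are collected into one candidate for manual inspection.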

5.3. Measurements and uncertainties 

Below we explain how the properties of the sources 
were measured and the uncertainties calculated. The fi- 
nal values are presented in the CORNISH catalogue, in- 
cluding the measurement-error and the absolute uncer- 
tainty on each parameter, incorporating the calibration 
error of 8.9 percent. 

Fig. 13. — Left panel: Example of a polygonal photometry aperture drawn around a spatially extended CORNISH detection,
in this case the UCH II region G043.8894-00.7840. The integrated flux density is calculated from the enclosed pixels minus an
average background flux measured from the sky in the vicinity of the source. A cross marks the intensity-weighted position.
Right panel: The GLIMPSE 3-colour image (red = 8.3 μm, green = 4.5 μm, blue = 3.6 μm) exhibits an overall morphology similar
to that of the radio emission.

5.3.1. Gaussian fits

Uncertainties on the Gaussian fits are calculated using
the equations derived by Condon (1997), summarised
here for convenience. Noise in interferometric data is
correlated on the scale of the synthesised beam FWHM,
in this case $\theta_{\rm bm}$ = 1.5". The effective signal-to-noise
level $\rho$ of a source with measured peak amplitude $A_{\rm peak}$
seen against a background of correlated Gaussian noise
is given by

$$\rho^2 = \frac{\theta_M \theta_m}{4\theta_{\rm bm}^2}\left[1 + \left(\frac{\theta_{\rm bm}}{\theta_M}\right)^2\right]^{\alpha_M}\left[1 + \left(\frac{\theta_{\rm bm}}{\theta_m}\right)^2\right]^{\alpha_m}\frac{A_{\rm peak}^2}{\sigma_{\rm sky}^2}, \qquad (3)$$

where $\theta_M$ and $\theta_m$ are the respective major and minor
fitted axes and $\sigma_{\rm sky}$ is the RMS noise measured directly
from the image. The exponents $\alpha_M$ and $\alpha_m$ have been
estimated by Condon (1997) via Monte-Carlo simulations
and are $\alpha_M = \alpha_m = 3/2$ for the amplitude and flux
density errors, $\alpha_M = 5/2$, $\alpha_m = 1/2$ for the error on
the major axis, and $\alpha_M = 1/2$, $\alpha_m = 5/2$ for the minor
axes, position angle and absolute coordinate errors. On
average, the signal-to-noise ratio is increased by a factor
of 1.4. The positional uncertainties parallel to the major
($\sigma_M$) and minor ($\sigma_m$) fitted axes are given by
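
$$\sigma_M^2 = \frac{\theta_M^2}{4\ln(2)\,\rho^2}, \qquad \sigma_m^2 = \frac{\theta_m^2}{4\ln(2)\,\rho^2}, \qquad (4)$$

using values for $\rho$ calculated from Equation 3. When the
fit is projected onto equatorial axes the absolute position
errors in right ascension ($\sigma_\alpha$) and declination ($\sigma_\delta$)
become

$$\sigma_\alpha^2 = \varepsilon_\alpha^2 + \sigma_M^2\sin^2({\rm P.A.}) + \sigma_m^2\cos^2({\rm P.A.}),$$
$$\sigma_\delta^2 = \varepsilon_\delta^2 + \sigma_M^2\cos^2({\rm P.A.}) + \sigma_m^2\sin^2({\rm P.A.}), \qquad (5)$$
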
where P.A. is the position angle of the fitted major axis
east of north and $\varepsilon_\alpha = \varepsilon_\delta \approx 0.1$ arcseconds is the
systematic positional uncertainty. This value was determined
via a comparison between CORNISH and catalogues of
quasars whose positions are determined to milliarcsecond
accuracy. The 15 matching quasars were drawn
from the Goddard VLBI astrometric catalogue, the Very
Long Baseline Array Galactic Plane Survey (VGaPS,
Petrov et al. 2011) and the VLA calibrator manual, and
their median offset of 0.1 arcseconds was adopted as the
systematic positional uncertainty for CORNISH.
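Errors in the P.A. may be calculated from

$$\sigma_{\rm P.A.}^2 = \frac{4}{\rho^2}\left(\frac{\theta_M\theta_m}{\theta_M^2 - \theta_m^2}\right)^2, \qquad (6)$$

although we note that position angle values are only relevant
when one or more axes is significantly resolved.
Uncertainties associated with the fitted major and minor
Gaussian FWHM are given by

$$\sigma^2(\theta_M) = \frac{2\theta_M^2}{\rho^2} + \varepsilon_\theta^2\theta_M^2, \qquad \sigma^2(\theta_m) = \frac{2\theta_m^2}{\rho^2} + \varepsilon_\theta^2\theta_m^2. \qquad (7)$$
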
The fractional calibration uncertainty $\varepsilon_\theta = 0.02$ is
adopted from the VLA NVSS survey (Condon et al.
1998), which was observed using a similar snapshot
mode. A single characteristic measured angular size $\theta_f$
may be obtained from the geometric mean of the major
and minor axes

$$\theta_f = \sqrt{\theta_M\theta_m}, \qquad (8)$$

and its associated measurement uncertainty given by

(9)

The restoring beam was forced to be a circular Gaussian
of FWHM $\theta_{\rm bm}$ = 1.5" over the whole survey area so the
deconvolved source size $\theta_s$ in arcseconds may be found
from

$$\theta_s = \sqrt{\theta_f^2 - 1.5^2}, \qquad (10)$$

although we note that detections with $\theta_f$ < 1.8" are
considered unresolved in CORNISH.

11 http://gemini.gsfc.nasa.gov/solutions/

12 http://www.vla.nrao.edu/astro/calib/manual/
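
As a worked illustration of the quantities defined so far, the sketch below evaluates the effective signal-to-noise, the characteristic size and the deconvolved size. It assumes that Equation 3 takes the standard Condon (1997) form, $\rho^2 = (\theta_M\theta_m/4\theta_{\rm bm}^2)[1+(\theta_{\rm bm}/\theta_M)^2]^{\alpha_M}[1+(\theta_{\rm bm}/\theta_m)^2]^{\alpha_m}A_{\rm peak}^2/\sigma_{\rm sky}^2$; the function names are hypothetical and this is not the catalogue code.

```python
import math

THETA_BM = 1.5  # restoring beam FWHM in arcsec


def rho_squared(peak, sigma_sky, theta_maj, theta_min,
                alpha_maj=1.5, alpha_min=1.5, theta_bm=THETA_BM):
    """Effective signal-to-noise squared, assuming the Condon (1997) form.

    The default exponents (3/2, 3/2) apply to amplitude and flux errors.
    """
    return ((theta_maj * theta_min) / (4.0 * theta_bm ** 2)
            * (1.0 + (theta_bm / theta_maj) ** 2) ** alpha_maj
            * (1.0 + (theta_bm / theta_min) ** 2) ** alpha_min
            * (peak / sigma_sky) ** 2)


def characteristic_size(theta_maj, theta_min):
    """Geometric mean of the fitted major and minor axes."""
    return math.sqrt(theta_maj * theta_min)


def deconvolved_size(theta_f, theta_bm=THETA_BM, unresolved_below=1.8):
    """Deconvolved size in arcsec, or None if treated as unresolved."""
    if theta_f < unresolved_below:
        return None
    return math.sqrt(theta_f ** 2 - theta_bm ** 2)


# An unresolved 7-sigma detection: rho exceeds the raw ratio by sqrt(2).
rho = math.sqrt(rho_squared(7.0, 1.0, THETA_BM, THETA_BM))
print(f"rho / (A/sigma) = {rho / 7.0:.3f}")
```

For a source whose fitted axes equal the beam FWHM the ratio evaluates to $\sqrt{2} \approx 1.41$, which matches the factor of 1.4 quoted in the text.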

The fitted amplitude $A_{\rm peak}$ must be corrected for the
clean bias $\Delta A_{\rm cb} = -0.94\sigma_{\rm sky}$, which we measured in
Section 5.6 below, so

$$A = A_{\rm peak} - \Delta A_{\rm cb}. \qquad (11)$$

The uncertainty on the fitted amplitude may be calculated
from

$$\sigma_A^2 = \frac{2A^2}{\rho^2} + \varepsilon_A^2 A^2, \qquad (12)$$

where $\varepsilon_A = 0.089$ is the fractional amplitude calibration
error. Finally, the integrated flux density $S$ under a 2D
Gaussian is given by

(13)

and the corresponding uncertainty is

(14)

5.3.2. Aperture photometry

The peak amplitude reported for sources measured using
aperture photometry is simply the intensity of the
brightest pixel within the source aperture, corrected for
the clean bias:

$$A = A_{\max} - \Delta A_{\rm cb}. \qquad (15)$$

The effective signal-to-noise of the source in the presence
of correlated Gaussian noise may be determined from a
modified version of Equation 3 in which $\theta_M$ and $\theta_m$ are
both replaced with the intensity-weighted diameter

(16)

For a perfectly circular Gaussian source $\theta_d = \theta_f$ =
FWHM. In Equation 16 $r_i$ is the angular distance from
the $i$-th pixel to the brightness-weighted centre and $A_i$ is
its intensity. The sky-noise corrected for Gaussian
correlation is then

(17)

where $\alpha = 3$ for all errors. The uncertainty on the peak
amplitude is

$$\sigma_A^2 = \sigma^2 + \varepsilon_A^2 A_{\max}^2, \qquad (18)$$

where $\varepsilon_A = 0.089$ is the calibration error.

The equation for measuring the integrated flux density
$S_{\rm phot}$ of a source using aperture photometry can be
written as

$$S_{\rm phot} = \frac{\sum_{i=1}^{N_{\rm src}} A_i - N_{\rm src} B}{\Omega_{\rm bm}}, \qquad (19)$$

where $\sum A_i$ is the total flux in the source aperture
summed over $N_{\rm src}$ pixel elements (in units of
Jy pixel$^{-1}$), $B$ is the background flux density estimated
from the median level in the sky-annulus and
$\Omega_{\rm bm} = \pi\theta_{\rm bm}^2/(4\ln(2)\,A_{\rm pix})$ is the beam-area in pixels
(28.33 pixels for CORNISH data). $S_{\rm phot}$ must also be
corrected for clean-bias. If the source is unresolved the
missing flux $\Delta S_{\rm cb}$ is given by

(20)

however, because the clean-bias reduces the flux in all
clean-components (CCs) by a constant factor the effect
on extended emission is difficult to gauge. The minimum
number of CCs required to model an extended source
can be estimated from the number of beam-areas $n_{\rm beams}$
subtended by the emission. The integrated flux density
is then

$$S = S_{\rm phot} - \Delta S_{\rm cb}\, n_{\rm beams}. \qquad (21)$$

The error on the integrated flux density may be found
from

(22)

where $\sigma^2$ is the corrected variance, $N_{\rm src}$ and $N_{\rm sky}$ are
the number of pixels in the source- and sky-apertures,
respectively. The term $\sigma(\sum A_i)$ is the uncertainty on the
sum over the pixels in the source aperture given by

(23)

The uncertainty on the intensity weighted diameter
may be found from

(24)

where the sums are taken over the pixels in the source-aperture.
For emission measured using a polygonal aperture
the intensity-weighted position is given by

(25)

where $x_i$ is the right-ascension ($\alpha$) or declination ($\delta$) in
the $i$-th pixel. The corresponding error in $x$ is given by

(26)

with $\varepsilon_x = \varepsilon_\alpha = \varepsilon_\delta$, the absolute positional error of the
associated phase calibrator.
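
The background-subtracted aperture sum described in this section can be sketched in NumPy. The sketch assumes the integrated flux takes the form $S_{\rm phot} = (\sum A_i - N_{\rm src}B)/\Omega_{\rm bm}$ implied by the definitions above, and a 0.3" pixel scale inferred from 800 pixels spanning 4'. Function names and the mask-based interface are hypothetical, and no clean-bias correction is applied.

```python
import math
import numpy as np

PIXEL_ARCSEC = 0.3   # assumed pixel scale (800 pixels = 4 arcmin)
THETA_BM = 1.5       # restoring beam FWHM in arcsec


def beam_area_pixels(theta_bm=THETA_BM, pixel=PIXEL_ARCSEC):
    """Area of a circular Gaussian beam in pixels."""
    return math.pi * theta_bm ** 2 / (4.0 * math.log(2.0) * pixel ** 2)


def aperture_flux(image, source_mask, sky_mask):
    """Background-subtracted integrated flux density.

    image       : 2D array in Jy/beam
    source_mask : boolean array selecting the source aperture
    sky_mask    : boolean array selecting the sky annulus
    """
    background = np.median(image[sky_mask])   # median sky level B
    n_src = int(source_mask.sum())            # number of source pixels
    total = image[source_mask].sum()          # sum of A_i over the aperture
    return (total - n_src * background) / beam_area_pixels()


print(f"beam area = {beam_area_pixels():.2f} pixels")
```

With these assumptions the beam area evaluates to about 28.33 pixels, the value quoted above, and a 1 Jy/beam point source on a flat background integrates to 1 Jy.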

5.4. Spurious sources 

Fig. 14. — The solid histogram illustrates the cumulative
distribution of spurious detections in the CORNISH catalogue
as a function of the signal-to-noise ratio measured from
fourteen inverted tiles. The plot has been scaled to the total
CORNISH area (~ 1300 tile areas) by multiplying by 81.
In comparison, the hatched histogram shows the distribution
of detections from the same tiles before inversion. See
Section 5.4 for further details.

Fig. 15. — Density map of all CORNISH sources detected above 5σ. The greyscale level illustrates the number of sources found
within an 8 arcmin radius of any position. Away from complexes of H II regions, which are mostly near the Galactic mid-plane,
variations in the source counts are due to spurious sources, e.g., the bad scan-row at l ≈ 14°.

We have attempted to estimate the number of spurious
sources detected in well calibrated and well behaved
data by running the source finder on inverted tiles,
i.e., tiles where the pixel values have been multiplied
by -1. Any negative detections will be false and allow
us to estimate the number of spurious sources as
a function of the signal-to-noise ratio. Fourteen tiles
were selected to be representative of the emission properties
across the survey region. They contained variously:
no strong emission, one or more point sources with
$S_{\rm 5\,GHz}$ > 50 mJy, weak extended emission, and bright
extended sources causing moderately elevated noise levels
(0.5 mJy < RMS < 0.8 mJy). For comparison, we ran the
source finder using a 3.5σ cutoff on both the inverted and
the regular tiles. Figure 14 plots the cumulative counts
of detections above a given signal-to-noise ratio, expressed
as σ. The grey-shaded histogram illustrates the number
of spurious detections in the inverted tiles, while the
hatched histogram illustrates the detections in the normal
tiles, some of which will be real. Note that below
4.5σ the detections are dominated by spurious sources.
The fourteen tiles represent 1.23 percent of the survey
area, so by scaling the plot by 81 we can estimate the total
distribution of spurious sources in CORNISH. In Figure 14
the number of spurious sources found decreases
to 81 at 6.1σ, above which our scaling is too crude to
sample. For populations governed by Gaussian statistics
the fraction $f(\sigma)$ of the population lying beyond a
σ-threshold is given by

$$f(\sigma) = 1 - {\rm erf}(\sigma/\sqrt{2}), \qquad (27)$$

where erf is the Gaussian error function. The solid
black curve in Figure 14 plots $f(\sigma)$ assuming the total
number of possible detections is equal to the number of
synthesised beam areas in CORNISH ($5.6 \times 10^8$ beams).
It is clear that this assumption underestimates the number
of spurious sources found, so we fit $f(\sigma)$ with the
total number of sources as a free-parameter. The fit is
shown by the short-dashed line and is dominated by the
large number of sources in the bins with σ < 4. Above 4σ
the distribution has a shallower fall-off than expected for
Gaussian statistics. An alternative is shown by the long-dashed
line, which is fitted only to the bins with σ > 4
and uses the error-function of a distribution with a narrower
width than purely Gaussian ($\sigma = 0.9\sigma_{\rm gauss}$). It is
a significantly better match to the high signal-to-noise
end of the distribution and predicts less than one spurious
source above 7σ. Based on this reasoning we have
chosen a signal-to-noise cutoff of 7σ for a high-reliability
CORNISH catalogue. We caution that data with greater
complexity or poor calibration may introduce significant
numbers of false sources above this level, so this does
not mitigate the need to manually inspect the data for
artefacts. Sources detected below 7σ are not offered as
an official data-product, but this low-reliability catalogue
will be made available on the CORNISH web page.
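
The pure-Gaussian expectation used as the baseline in this comparison can be evaluated directly. The sketch below assumes the tail fraction $f(\sigma) = 1 - {\rm erf}(\sigma/\sqrt{2})$ and one independent trial per synthesised beam area; the function names and the `width` parameter are illustrative and do not reproduce the fits shown in the figure.

```python
import math

N_BEAMS = 5.6e8          # synthesised beam areas in the survey (from the text)
TILE_FRACTION = 0.0123   # fraction of the survey covered by the fourteen tiles


def tail_fraction(threshold_sigma, width=1.0):
    """Two-sided fraction of a zero-mean Gaussian beyond the threshold.

    `width` rescales the distribution relative to a unit Gaussian;
    math.erfc(x) equals 1 - erf(x).
    """
    return math.erfc(threshold_sigma / (width * math.sqrt(2.0)))


def expected_false(threshold_sigma, n_trials=N_BEAMS, width=1.0):
    """Expected number of noise peaks beyond the threshold."""
    return n_trials * tail_fraction(threshold_sigma, width)


# Factor used to scale counts from the fourteen tiles to the whole survey.
scale = 1.0 / TILE_FRACTION

for s in (4.0, 5.0, 6.0, 7.0):
    print(f"{s:.0f} sigma: {expected_false(s):.3g} expected from Gaussian noise")
```

Under these idealised assumptions the expectation at 7σ is far below one source, so any spurious detections that do appear near this level reflect non-Gaussian effects such as deconvolution artefacts rather than thermal noise.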

5.4.1. Density of low signal-to-noise sources

A density map of the CORNISH detections serves to
highlight regions containing excessive numbers of weak
sources, some of which may be spurious. Figure 15
plots the number density of CORNISH sources above 5σ
summed within an eight arcminute radius. The most
prominent feature is a line of elevated pixels corresponding
to the scan row at δ = -16°55'03.13" (13.3° < l <
14.5°). This row is unique in the survey as each field
has only a single 40 second snapshot observation. Data
in the second pass were found to be corrupted and no
repeat observations were scheduled due to an array
configuration change.

Isolated clumps of pixels with high source counts (e.g.,
at l = 12°, l = 31°, l = 43° and l = 49°) correspond to
molecular cloud complexes forming massive stars. The
over-density of weak sources in these regions could be
due to either a real increase in source counts or to an
increase in spurious sources generated by the deconvolution
process. Both scenarios warrant careful inspection
of the data.

5.5. Completeness 

To quantify the formal sensitivity limits of the pipeline 
reduced data we conducted completeness tests on tiles 
from 'empty' parts of the sky. The tiles were chosen to 
have few detections above 5a, homogeneous noise prop- 
erties and be free of obvious imaging artefacts. One- 
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Fig. 16. — Percentage completeness as a function of flux 
density for point sources within eight representative tiles. See 
the text in Section \5. 51 for further details. 



hundred artificial point sources were injected into the cal- 
ibrated wy-data for each tile before creating a mosaiced 
image. The flux densities of the injected sources were var- 
ied randomly between 0.5 and 5.0 mjy, so as to bracket 
the expected sensitivity limit. Positions were also cho- 
sen randomly, but avoided known emission, the tile edges 
or regions where noise spikes were common. FndSou was 
then used to find and fit the emission with Gaussians. 
After twenty iterations of the injection-imaging-fitting 
routine the aggregate results were compared to the in- 
jected source parameters. 

Figure [16] plots the percentage of sources recovered as
a function of the flux density for tiles covering a range of 
RMS values and Epochs. The image parameters are pre- 
sented in Table [2] alongside the fifty and ninety percent 
completeness limits. As expected, tiles with lower mea- 
sured RMS noise levels tend to have lower completeness 
limits. There is, however, significant variation between 
tiles as the local completeness limit ultimately depends 
on the uniformity of the noise pattern within the tile. 
At worst (tile 529, Epoch I) the CORNISH survey is 90 
percent complete to point sources at the 3.9 mJy level.
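
The injection and recovery test described above can be summarised in a few lines of Python. This is a minimal sketch and not the CORNISH pipeline: the function names `completeness_curve` and `completeness_limit`, the choice of flux-density bins and the linear interpolation used to read off the fifty and ninety percent limits are all assumptions made for illustration. The inputs are the injected flux densities and a boolean array recording whether the source finder recovered each injected source.

```python
import numpy as np


def completeness_curve(injected_flux, recovered, bin_edges):
    """Percentage of injected sources recovered in each flux-density bin.

    injected_flux : flux densities of the artificial sources (mJy)
    recovered     : True where the source finder recovered the source
    bin_edges     : edges of the flux-density bins (mJy)
    """
    injected_flux = np.asarray(injected_flux, dtype=float)
    recovered = np.asarray(recovered, dtype=bool)
    bin_edges = np.asarray(bin_edges, dtype=float)

    n_injected, _ = np.histogram(injected_flux, bins=bin_edges)
    n_recovered, _ = np.histogram(injected_flux[recovered], bins=bin_edges)

    # Bins that received no injected sources are reported as NaN.
    percent = np.full(n_injected.shape, np.nan)
    filled = n_injected > 0
    percent[filled] = 100.0 * n_recovered[filled] / n_injected[filled]

    centres = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return centres, percent


def completeness_limit(centres, percent, level):
    """Flux density at which the curve first reaches `level` percent.

    Hypothetical helper: interpolates linearly between the two bins
    that bracket the requested level. Returns NaN if never reached.
    """
    centres = np.asarray(centres, dtype=float)
    percent = np.asarray(percent, dtype=float)
    good = ~np.isnan(percent)
    c, p = centres[good], percent[good]

    above = np.nonzero(p >= level)[0]
    if above.size == 0:
        return np.nan
    i = above[0]
    if i == 0:
        return c[0]
    frac = (level - p[i - 1]) / (p[i] - p[i - 1])
    return c[i - 1] + frac * (c[i] - c[i - 1])
```

Aggregating the results of all twenty iterations before binning, as in the text, simply means concatenating the injected and recovered arrays from each run before calling `completeness_curve`.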

5.6. Clean bias 

When deconvolving the synthesised beam from the im- 
ages, the flux level at which the clean algorithm halts 
must be chosen carefully. If the cutoff is set too far above 
the noise then the residual images will be dominated by 
sidelobe patterns. If it is too low clean will inadver- 
tently identify noise-spikes and sidelobes as real emis- 
sion. Both negative and positive clean components on 
sidelobes will result in flux being subtracted from the 
positions of real sources and can artificially lower the 
RMS noise. The AutoWindow function described in Section
3.2.1 has been shown to reduce the clean bias. However,
we chose a relatively low clean cutoff when imaging
CORNISH data and need to measure the bias level
in order to evaluate the correct flux densities and uncertainties
for the CORNISH catalogue. In a similar test
to the one presented in Section 5.5 we inserted twenty
point sources into the uv-data for each of tiles 43, 273,
529 and 1248. The flux densities were set randomly between
2 and 20 mJy. The data were imaged and mosaiced
using the CORNISH pipeline, and aperture photometry 



Table 2: Completeness limits of tiles spanning a repre- 
sentative range of noise levels. 
was used to recover the artificial source fluxes. We found
that the clean bias was consistent across all four tiles, despite
being representative of different epochs. We adopt
a mean clean bias of $\Delta A_{\rm peak} = -0.94\,\sigma_{\rm sky}$ (typically
0.33 mJy), indicating a moderate level of over-cleaning
compared to NVSS, which quotes $\Delta A_{\rm peak} = -0.67\,\sigma$. We
judge that this will not affect the utility of the catalogue.
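
As an illustration of how such a bias can be estimated from the injected sources, the sketch below averages the offset between recovered and injected flux in units of the local noise. The text does not state exactly how the mean was formed, so the per-source normalisation by the sky RMS and the function name `clean_bias` are assumptions.

```python
import numpy as np


def clean_bias(injected, recovered, sigma_sky):
    """Mean offset (recovered - injected) in units of the local RMS noise.

    injected  : injected peak fluxes (mJy per beam)
    recovered : fluxes measured for the same sources after imaging
    sigma_sky : local RMS noise at each source position (mJy per beam)

    Returns the mean bias and its standard error. Assumption: each
    offset is normalised by its own local RMS before averaging.
    """
    injected = np.asarray(injected, dtype=float)
    recovered = np.asarray(recovered, dtype=float)
    sigma_sky = np.asarray(sigma_sky, dtype=float)

    offsets = (recovered - injected) / sigma_sky
    bias = offsets.mean()
    error = offsets.std(ddof=1) / np.sqrt(offsets.size)
    return bias, error
```

For the median CORNISH noise level of about 0.35 mJy per beam quoted in Section 6.3, a bias of -0.94 sigma corresponds to roughly 0.94 x 0.35 = 0.33 mJy, which matches the typical value given above.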

5.7. Manual quality control 

According to the results of Section 5.4 we do not expect
to find significant numbers of spurious sources above
7σ in well-behaved CORNISH data. This statement is not
necessarily true of high-noise fields containing bright and
extended emission associated with massive star-forming
complexes. Occasionally, peaks in the rippled noise pattern
may be mistaken for real emission, or calibration
errors may conspire to create false sources. To alleviate
this problem we visually inspected all high-reliability
CORNISH detections (i.e., those peaking above 7σ) to
assess them as potential artefacts.

The CORNISH team visually inspected all mosaic tiles
and individual sources in the 7σ catalogue. All sources
were classified as being either 'unlikely', 'possibly' or
'likely' an artefact based on the criteria above, i.e., located
on a peak in a high-noise ripple region, near a
very bright source, or in a region where there appears
to be an excessive number of potentially spurious 5 - 7σ
sources (see Figure [15]). If a source suspected of being
possibly or likely an artefact was found to have a radio
or infrared counterpart then, of course, the flag was left
as 'unlikely' in the CORNISH database. Smaller UCH II
regions lying within the noise radius of much brighter
emission often have counterparts in the GLIMPSE IRAC
data, while planetary nebulae (PN) are often seen in the
UKIDSS data, confirming them as real detections. Both
source types appear in the far-infrared MIPSGAL bands
(24 μm and 70 μm). Real extragalactic sources are not
likely to have counterparts in the infrared datasets and so
retain their possible-artefact flag in suspect regions.

6. Results 

We found 3,062 sources in the CORNISH data above a 
7σ detection threshold. Of these, 2,591 were well fit by
model Gaussians and the remaining 471 sources required 
measurement using a hand-drawn polygonal aperture. A 
total of 286 and 138 sources were classified as 'possible' 
or 'likely' artefacts, respectively, and a flag set in the final 
high-reliability catalogue. They remain available in the 
on-line catalogue and users will be able to include pos- 
sible and likely artefact sources in their searches. Below 
we present the new, high-reliability catalogue of 5 GHz 
radio-emission containing 2,638 sources. 

6.1. Catalogue format 

Isolated and unresolved sources identified by the source 
finder have two recorded entries taken from fitted Gaus- 
sian parameters and aperture photometry measurements. 
Sources exhibiting structured and extended emission 



have a single entry, based on aperture photometry per- 
formed using a manually drawn polygon. When assem- 
bling an aggregate catalogue we favoured the Gaussian 
fitted values. The photometric measurements are useful 
for diagnostic purposes. 

An excerpt from the final CORNISH catalogue is presented
in Table [3]. The columns are as follows: column
(1) contains the CORNISH source name, constructed
from the Galactic longitude and latitude of the source.
The equivalent right-ascension (α) and declination (δ)
are displayed in columns (2) and (3), respectively. For
sources well fitted by Gaussians the adopted coordinates
are simply the peak positions of the fits. The intensity-weighted
position is quoted for extended sources measured
using a polygonal aperture. The associated positional
uncertainties are given in columns (4) and (5).
Two uncertainty values are quoted for catalogue entries.
The first value is the absolute uncertainty incorporating
both measurement and calibration errors. The second
value (in brackets) is the error associated with the photometry
or Gaussian fit alone. Column (6) presents the
peak flux density in units of mJy beam⁻¹. The 5 GHz integrated
flux density ($S_{\rm 5GHz}$) is presented in column (7).
Column (8) contains the measured angular scale of the
emission, $\theta_S$, which has been determined from the geometric
average of the major and minor Gaussian fit axes, or
the intensity-weighted diameter in the case of extended
emission. Sources with $\theta_S$ > 1.8" are considered to be
resolved in the CORNISH images and their deconvolved
sizes are presented in column (9). The local RMS noise
measured from the photometric sky-annulus is recorded
in column (10). Column (11) notes how the flux density
was measured, either with a polygonal aperture or
using the Gaussian fit. Finally, column (12) contains a
range of flags notifying the reader if the source is:

• within 12" of another source, 

• lying on an unusually high-noise region (RMS >
0.45 mJy beam⁻¹),

• imaged using the smoothed weighting scheme described
in Section 3.2.2 (within 4.45' of a field centre,
i.e., half a primary beam FWHM),

• within 3' of a bright (> 0.5 Jy) source, 

• within 2' of the edge of the survey, 

• within an area containing numerous low signal-to- 
noise detections likely to be spurious, 

• overlapping with another source,

• or flagged as a suspected artefact during
manual inspection.

The flags are described in more detail in the footnotes to
Table [3].
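
For readers working with the plain-text catalogue, the following sketch shows how a source name of the form used in this paper (e.g., G013.8726+00.2818) could be built from Galactic coordinates, and how the single-character flag codes listed in the footnotes to Table 3 could be expanded. Both helpers are hypothetical: the catalogue's own convention for rounding or truncating coordinates in names is not stated here, and we assume that the flags are stored as a simple string of code characters.

```python
# Flag codes as listed in footnote c of Table 3.
FLAG_MEANINGS = {
    "C": "part of a cluster (within 12 arcsec of another source)",
    "E": "within two arcminutes of a survey edge",
    "N": "in a high-noise region (RMS > 0.45 mJy)",
    "B": "within 3 arcmin of a bright source",
    "W": "imaged using the smoothed weighting scheme",
    "7": "overlaps with another 7-sigma catalogue source",
    "5": "overlaps with a 5 - 7 sigma source",
    "S": "in a region with a high concentration of 5 - 7 sigma sources",
}


def cornish_name(glon_deg, glat_deg):
    """Format a name such as G013.8726+00.2818 from (l, b) in degrees.

    Assumption: coordinates are rounded to four decimal places.
    """
    return f"G{glon_deg:08.4f}{glat_deg:+08.4f}"


def decode_flags(flag_string):
    """Return the meaning of each flag code in `flag_string`.

    Whitespace and commas are ignored; unknown codes raise ValueError.
    """
    meanings = []
    for code in flag_string:
        if code.isspace() or code == ",":
            continue
        if code not in FLAG_MEANINGS:
            raise ValueError(f"unknown flag code: {code!r}")
        meanings.append(FLAG_MEANINGS[code])
    return meanings
```

For example, `cornish_name(18.3024, -0.3910)` reproduces the name of the irregular H II region shown in Section 7.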

The full CORNISH catalogue is offered to the astro- 
nomical community as a plain text file or VO-table (XML 
based table format defined by the Virtual Observatory) 
Fig. 17. — Distribution of source angular sizes for the
high-reliability catalogue.

Fig. 18. — Distribution of CORNISH sources as a function of
Galactic latitude (upper panel) and longitude (lower panel).
In both panels the high-reliability catalogue (> 7σ) is plotted
using solid shading, while the hatched histogram contains a
sub-sample of resolved sources ($\theta_S$ > 1.8").



on the project website (http://cornish.leeds.ac.uk). A
query-based web interface is also available which allows
the user to retrieve specific catalogue subsets and drill 
down to the underlying data. 

6.2. Ensemble source properties 

6.2.1. Angular size 

The measured angular size $\theta_S$ quoted in the catalogue
is given by the geometric average of the major and minor
fitted Gaussian FWHM axes ($\sqrt{\theta_M \theta_m}$), or by the
intensity-weighted diameter in the case of emission
measured using a polygonal aperture. For a bright source
with a Gaussian morphology these measurements are
equivalent within the errors. Figure [17] plots the distribution
of measured angular sizes for the high-reliability
catalogue. The distribution begins to flatten at size-scales
greater than 5 arcseconds, before tapering out at
30", where the upper-limit for the source-fitter is set.
Above sizes of 14" the deconvolution algorithm struggles
to model the poorly-sampled longer uv-spacings (see
Section 3.2.2), hence the slight increase in counts at that
scale: broad sources are artificially truncated at sizes of
14".

The uncertainty on the angular size of CORNISH 




Fig. 19. — Distribution of integrated flux density (upper
panel; horizontal axis: flux density in mJy) and peak flux
(lower panel; horizontal axis: peak flux in mJy beam⁻¹). The
high-reliability catalogue is illustrated by the solid-shaded
histogram, while the hatched histogram contains only the
subset of resolved sources.



sources is better than 0.3" for 96 percent of compact
($\theta_S$ < 5") catalogue entries. We consider sources with
$\theta_S$ < 1.8" (i.e., the restoring beam size plus 0.3") to be
unresolved. Sixty-one percent of the 7σ catalogue falls
into this category. Below we examine the differences
between the resolved and unresolved populations.
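
The size definition and the resolved/unresolved split can be written compactly as below. This is a sketch under the definitions given in the text: the function names are our own, and the treatment of a source lying exactly on the 1.8" boundary (here counted as unresolved) is an assumption, since the text only specifies the two strict inequalities.

```python
import math

# Restoring beam size (1.5") plus the 0.3" size uncertainty, from the text.
RESOLVED_LIMIT_ARCSEC = 1.8


def angular_size(major_fwhm, minor_fwhm):
    """Geometric mean of the fitted major and minor FWHM axes (arcsec)."""
    return math.sqrt(major_fwhm * minor_fwhm)


def is_resolved(theta_s):
    """True if the measured angular size exceeds the 1.8" limit."""
    return theta_s > RESOLVED_LIMIT_ARCSEC
```

A fit with axes of 2.0" by 1.5" has a geometric-mean size of about 1.73" and is therefore classed as unresolved, even though its major axis alone exceeds the limit.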

6.2.2. Galactic distribution 

Figure [18] illustrates the distribution of CORNISH
sources as a function of Galactic latitude (upper panel)
and longitude (lower panel). The solid-shaded histogram
contains all sources in the high-reliability catalogue,
while the hatched histogram contains only the subset
of resolved detections (859 sources). Resolved sources
account entirely for the broad peak seen in the latitude
distribution, with the remaining unresolved detections
exhibiting a flat profile. The scale-height of the resolved
latitude distribution is 0.47°, consistent with that
of UCH II regions, 6.7 GHz methanol masers and other
2009; Urquhart et al. 2007, 2009, 2011). The supposition
that a large fraction of the resolved sources arise
in high-mass star-forming regions is lent weight by their
Galactic longitude distribution. The number of sources
per 2° bin increases gradually towards longitude zero,
while the two spikes at l ≈ 43° and l ≈ 50° correspond
to the W49 and W51 complexes, respectively.
Conversely, unresolved detections (1,719 sources) show
a flat distribution with Galactic longitude and are likely
to contain significant numbers of active galactic nuclei
(AGN), and other extragalactic sources. We note that
this is partly by design as coarse adjustments were made
to the deconvolution algorithm in order to keep the number
of low-level sources roughly constant (Figure [3]). The
expected density of extragalactic sources in CORNISH



may be calculated from Equation A2 of Anglada et al.
(1998), using the 5 GHz source counts of Condon (1984).



Assuming median values of RMS-noise for the two regions
presented in Figure [11] we expect to find ~2400
extragalactic sources in our 7σ catalogue, consistent with
the number of unresolved detections. Most extragalactic
sources can be classified as such because they are not
detected in any of the infrared wavebands. Specific catalogues of
CORNISH sources identified as UCH II regions, PNe and
AGN will be presented in a forthcoming paper.

6.2.3. Flux density and peak flux 

Figure [19] shows the distributions of flux densities and
peak fluxes for the CORNISH sources. At flux density
levels of ~5 mJy or greater the distribution is well fitted
with a power-law of index -0.81. Below 3 mJy the
number of sources begins to decrease as the 7σ detection
limit is encountered. The distribution of flux densities
for resolved sources turns over at approximately 6 mJy
due to the constraints imposed by their selection and the
7σ signal-to-noise cutoff.

The peak flux distributions are identical for the high-reliability
and the resolved source catalogues. Above the
sensitivity cutoff (~2.5 mJy beam⁻¹), both are fit by a



6.3. Comparison to other catalogues 

Of all prior observations the White et al. (2005) VLA
survey of the Galactic plane has the most similar observing
setup and sky-coverage. A comparison to that work
serves as a useful sanity-check on the ensemble properties
of the CORNISH catalogue and images. The 5 GHz component
of the White et al. survey was observed using the
D, DnC and C arrays. These more compact VLA configurations
(compared to B and BnA) yield better sensitivity
to extended emission than CORNISH, but at a lower resolution
(~6"). White et al. (2005) imaged the Galactic
plane between -10° < l < 42° and |b| < 0.4°, of which
25.6 square degrees overlap with the CORNISH target
area. The measured noise levels of their images are
lower, with a median RMS of ~0.27 mJy beam⁻¹ compared
to ~0.35 mJy beam⁻¹ for CORNISH data. The
cutoff limit for the White et al. source catalogue was
chosen to be 5.5σ (~1.4 mJy beam⁻¹) compared to 7σ
(~2.5 mJy beam⁻¹) in this work. We would expect similar
flux densities for compact sources (< 6", see Figure [4])
common to both catalogues, despite differences in uv-coverage.
Systematic errors present in either catalogue
should be obvious in a flux-flux comparison plot.

The White et al. 5 GHz catalogue contains 1822 entries
in the overlapping area and we match 558 of these
with 521 CORNISH sources using a 5" search radius. The
number of matches diminishes significantly at matching
radii greater than 2"; however, a 5" matching radius was
chosen to allow for offsets in the positions assigned to
resolved sources in both catalogues. Figure [20] presents
a comparison of the measured flux densities for sources
Fig. 20. — Comparison of the 5 GHz flux density measurements
for sources common to the CORNISH and White et al.
(2005) catalogues. No systematic differences are apparent in
the plot; however, the absolute differences between the two
catalogues increase with flux density. The points are colour-coded
to show the angular size of the CORNISH detections
and it is clear that the most extended sources are responsible
for the outliers seen above ~100 mJy.

Fig. 21. — Comparison between 5 GHz continuum images from the White et al. (2005) VLA survey (top) and CORNISH
(bottom). Panel (i) shows a radio-galaxy detected in both surveys. The unresolved central driving source is detected in
CORNISH while the extended emission from the radio-lobes is resolved out. Similarly, emission on scales of ~30 arcseconds is
filtered from the cometary H II region G24.799+0.097, shown in panel (ii). However, CORNISH does an excellent job of imaging
structures on scales less than ~14 arcseconds. Panel (iii) shows the star-forming region G34.26+0.15. The detailed structure
is well-imaged by CORNISH. In comparison the White et al. survey struggles to resolve the individual knots of radio emission
and significant imaging artefacts are present in the image. The two compact sources at the centre of panel (ii) are not detected
in the White et al. survey.



successfully cross-matched between the two surveys. The
measurements agree, on average, to within 39 percent,
with no evidence of systematic differences. A greater
fraction of sources in the high flux density bins are resolved,
hence the outliers in the plot above ~100 mJy
may be attributed to imaging and measurement differences
between the two surveys. Points representing individual
sources in Figure [20] are colour-coded to indicate
angular size in the CORNISH catalogue. Unresolved
sources (blue) cluster around the equality line while the
outliers are almost all extended (red).
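
A simple cone search of the kind used for this comparison can be sketched as follows. This is an illustration only, written with plain NumPy and a brute-force comparison of every pair of positions; the function names and the use of the haversine formula are our own choices and not a description of the software used for the CORNISH analysis. All pairs within the search radius are returned, so several entries in one catalogue may match a single source in the other, as is the case for the 558 White et al. entries matched to 521 CORNISH sources.

```python
import numpy as np


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine formula).

    All inputs are in degrees and may be scalars or broadcastable arrays.
    """
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    a = np.clip(a, 0.0, 1.0)  # guard against rounding just outside [0, 1]
    return np.degrees(2.0 * np.arcsin(np.sqrt(a))) * 3600.0


def cone_match(ra_a, dec_a, ra_b, dec_b, radius_arcsec=5.0):
    """Return all index pairs (i, j) with separation <= radius_arcsec.

    Index i refers to catalogue A and index j to catalogue B.
    """
    ra_a = np.asarray(ra_a, dtype=float)[:, None]
    dec_a = np.asarray(dec_a, dtype=float)[:, None]
    ra_b = np.asarray(ra_b, dtype=float)[None, :]
    dec_b = np.asarray(dec_b, dtype=float)[None, :]

    sep = angular_separation_arcsec(ra_a, dec_a, ra_b, dec_b)
    idx_a, idx_b = np.nonzero(sep <= radius_arcsec)
    return list(zip(idx_a.tolist(), idx_b.tolist()))
```

The pairwise separation matrix is small enough to hold in memory for catalogues of a few thousand entries; larger catalogues would call for a spatial index instead.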

In total, 1264 sources in the White et al. catalogue 
remain unmatched using a simple cone search within 
five arcseconds. Of these, fifty percent lie above our 7σ
sensitivity threshold and are sufficiently bright to have 
been detected in CORNISH. The reasons for the dis- 
parity become apparent upon comparing the White et 
al. and CORNISH images. A significant fraction of the 
bright, unmatched sources have angular scales greater 
than ~ 20" in the White et al. data and are sim- 
ply resolved out by the VLA B configurations used by 
CORNISH. Figure [21] (panels i and ii) presents examples 
of such objects. The central source powering the radio- 
galaxy shown in panel (i) is detected in both surveys 
as a 25 mJy point source. The radio-lobes have angular 



scales of ~30" in the White et al. image and the brighter
southern lobe has a peak flux density of 14 mJy beam⁻¹.
In the corresponding CORNISH image the northern lobe
is completely resolved out, while the southern lobe is detected
at a 5.5σ level and is therefore not included in
the high-reliability catalogue. Similarly, the cometary
H II region G24.799+0.097, shown in panel (ii), is resolved
into multiple components by CORNISH. When
assembling the CORNISH catalogue we took great care
to identify such over-resolved emission as a single source
(see Section 5.2). In the White et al. catalogue individual
Gaussian fits to complex emission are left separate.
G24.799+0.097, for example, has four catalogue entries
and a 15" matching radius is required to correctly match
these to their CORNISH counterpart.

Panel (iii) of Figure [21] presents an image of the
G34.26+0.15 star-formation region. G34.26+0.15 is divided
into three components (a, b and c), of which c is
the prototype cometary UCH II region (van Buren et al.
1990). The CORNISH image clearly resolves all three
components with excellent image fidelity. By contrast
the White et al. image barely resolves the c component 
and contains significant numbers of imaging artefacts. 

The remaining unmatched White et al. sources derive 
from intrinsic differences in the image quality between 
Fig. 22. — Sample CORNISH images (left column) alongside Spitzer GLIMPSE (middle column) and MIPSGAL (right column)
infrared data. The green polygon illustrates the aperture used to measure the properties of the 5 GHz emission. In the three-colour
GLIMPSE images the 8.0 μm band is coded red, the 4.5 μm green and 3.6 μm blue.



the two surveys. In general, the CORNISH images have 
more homogeneous noise properties and are of higher 
quality. The White et al. images contain numerous 
compact sites of emission not present in the equivalent 
CORNISH mosaics. These often occur in regions adjacent 
to very bright sources or poorly imaged extended emis- 
sion, and are themselves generally unresolved and weak 
(90 percent < 10 mJy beam⁻¹). Their morphology and
location make them likely to be artefacts of the imaging
process. Examples of such artefacts are visible in panel
(iii) and, to a lesser extent, in panel (ii) of Figure [21].
Artefacts in the CORNISH image are limited to a moderate
level of ripple and a few noise-spikes. By comparison
the White et al. images often contain significant sidelobe
structure and a number of spurious emission sources.
The advantages of utilising a semi-automatic pipeline are
apparent in the high quality of the CORNISH images. In
general, such fields also illustrate the importance of man- 
ually inspecting the results of automatic source-finders 
when dealing with under-sampled interferometric data. 

There are 262 CORNISH sources which have no counterpart
in the White et al. catalogue, despite peaking
well above the nominal 5.5σ detection limit. A small
fraction of these are pathological cases, like the two compact
sources at the centre of the CORNISH image in panel
(ii) of Figure [21]. The sources have flux densities of 12
and 20 mJy and should be visible in the White et al.
image. A linear discontinuity cuts through the image
at this position, so we speculate that the omission derives
from a problem with the imaging process used by
White et al. (2005). The majority of the remaining unmatched
CORNISH sources are detected below the 10σ
level and when present in the White et al. images are
washed out by ripples in the noise or other imaging artefacts.

7. Example CORNISH data 

The CORNISH dataset contains objects of many types,
including H II and UCH II regions, PN, evolved stars,
active binaries, radio lobes from external galaxies, and
many AGN and quasars. Figure [22] presents sample
CORNISH images for known objects opposite their counterpart
data from the Spitzer GLIMPSE and MIPSGAL
surveys. The first column of images shows the CORNISH
data, the second a three-colour image made from the
mid-infrared GLIMPSE IRAC bands and the third column
the 70 μm MIPSGAL image. The first two rows
present examples of resolved H II regions with different
morphologies. G013.8726+00.2818 is a classical
cometary H II region, while G018.3024-00.3910 is irregularly
shaped. For most resolved H II regions in
CORNISH the shape of the 5 GHz continuum emission
is echoed and extended in the GLIMPSE three-colour image.
The 3.6 μm and 8.0 μm bands (coded blue and red,
respectively, in Figure [22]) contain broad lines from polyaromatic
hydrocarbons (PAHs), which are excited by the
strong ultraviolet radiation field (Peeters et al. 2002).
Galactic massive star-forming regions are readily identifiable
via the appearance of the filamentary PAH emission
surrounding them. The extended flocculent emission
(appearing purple in the GLIMPSE colour coding
in Figure [22]) traces the clumpy photodissociation region
(PDR) at the interface between the ionised gas and the
enveloping molecular cloud. The spectral-energy distribution
of H II regions peaks in the far-infrared and they
are easily detected as bright sources in the MIPSGAL
70 μm images.

The third row shows a particularly good example of a
resolved planetary nebula (PN). The GLIMPSE colours
of PNe are similar to those of the H II regions, but the
SED falls off more steeply in the far-infrared, hence the
MIPSGAL 70 μm band is noticeably less bright. PNe
tend to be isolated objects in the GLIMPSE images, having
long since dispersed their natal molecular clouds.
The expanding shell of gas surrounding the PN also
contains PAHs, which are excited by the ultraviolet
photons generated by the central stellar remnant (see
Smith & McLean 2008 and references therein). Unlike
the H II regions, the PAH emission is confined to the
ejected envelope, leading to a simple mid-infrared morphology.
G051.5095+00.2686 in Figure [22] exhibits a similar
ring-shape in both CORNISH and GLIMPSE images.

The final two rows present examples of radio-galaxies
in which the radio lobes have been resolved. In
G057.3066+00.5467 the central driving source is not detected
and the lobes are barely resolved as two tear-drop
shaped sources extended towards each other. The central
driving source of G060.7862-00.6360 is detected as
a point source near the centre of the image and both
radio-lobes are well-resolved, if weak. Neither radio-galaxy
has a counterpart in any of the associated mid-
or far-infrared images.

8. Summary and future work 

The CORNISH project has delivered the best ever complementary
radio view of the northern GLIMPSE region
at 5 GHz (6-cm wavelength). With a resolution of ~1.5"
and an RMS noise level of < 0.4 mJy beam⁻¹, the survey
is tailored to search for UCH II regions across the Galaxy,
but has also detected a wide range of radio-bright objects
of other types.

We present here a catalogue of 3,062 compact radio
sources detected in CORNISH data above a 7σ signal-to-noise
threshold. A high-reliability subset (2,638 sources)
has been flagged to exclude potential spurious
detections in poorly imaged uv-data. Fields containing
emission extended on scales greater than 14" are poorly
sampled by the uv-coverage of the VLA B-configuration,
giving rise to a small number of spurious sources. Such
fields represent only two percent of the survey area and
a rigorous program of manual inspection has flagged suspected
artefacts; hence, we estimate the catalogue reliability
to be better than 99 percent. To date, the
CORNISH catalogue is the most uniformly sensitive, ho- 
mogeneous and complete list of compact radio-emission 
sources at 5 GHz towards the northern Galactic plane. 



Mosaiced images and calibrated uv-data in FITS format
are available to download from the CORNISH website
(http://cornish.leeds.ac.uk). We have created a data
server, which is operated by submitting a list of positions
and serves either postage-stamp images or calibrated uv-data.
The full CORNISH catalogue is also available online
via a query-based interface, as a plain-text format file,
or a VO-table. General access is also available through
the VizieR service.

Much work remains to be done in order to fully ex- 
ploit the CORNISH dataset. A future paper in the survey 
(Purcell et al., in prep) will cross-match the 5 GHz ra- 
dio emission to the complementary Spitzer GLIMPSE and 
UKIDSS datasets, allowing the identification of specific 
source types via their SEDs. 
Table 3: 5 GHz sources in the CORNISH catalogue.

Note. — Two values are quoted for the uncertainty on all parameters. The first value is the absolute uncertainty, including measurement
and calibration errors. The second value, in parentheses, is the uncertainty on the measurement alone.

a Sources with $\theta_S$ < 1.8" are considered unresolved in the catalogue. We note that the 1.8" limit is only ~2 sigma from 1.5" for the weakest
sources, which may cause some weak and unresolved sources to be labelled as resolved.

b The flux density of sources marked with a 'P' in column (11) was measured using polygonal apertures drawn by hand on the images,
while a 'G' means the flux density and peak-flux measurements were taken directly from the Gaussian fit.

c The flag codes in column (12) have the following meanings: C = the source is part of a cluster, i.e., within 12" of another source; E = the
source is within two arcminutes of a survey edge; N = the source lies within a high-noise region (RMS > 0.45 mJy); B = the source lies within
3' of a bright (> 0.5 Jy) source; W = uv-data for one or more fields contributing to a source was imaged using a smoothed weighting scheme;
7 = the source overlaps with another 7σ catalogue source; 5 = the source overlaps with a 5 - 7σ source; S = the source is located in a region
with a high concentration of 5 - 7σ sources.



